Configuring InfoSphere DataStage so that jobs can be submitted by multiple users

To monitor and track jobs that are submitted by individual users, you must update configuration settings on your instance of Hadoop, as well as environment variables on your InfoSphere® Information Server installation.

Procedure

  1. If you are not using a Kerberos-enabled cluster, update the following settings for your instance of Hadoop:
    • Set the yarn.nodemanager.container-executor.class property to org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.
    • Set the yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users property to false.
    If you have a basic Apache Hadoop installation, you can modify these settings in the yarn-site.xml file. If you are using another distribution of Hadoop, such as the IBM Open Platform or the Cloudera Distribution Hadoop (CDH), you can search for these settings by using the cluster management user interface.
  2. On the computer where the engine tier is installed, open the yarnconfig.cfg file and set the APT_YARN_MULTIPLE_USERS environment variable to true.
    The yarnconfig.cfg file is located in the following default directory: /opt/IBM/InformationServer/Server/PXEngine/etc/yarn_conf.
    1. Verify that each user who will run InfoSphere DataStage® jobs has a user home directory in HDFS.
      For example, the user dsadm needs the following home directory in HDFS: /users/dsadm.
  3. If you are using a Kerberos-enabled cluster, perform the following steps:
    1. Verify that a Kerberos principal exists for each user who will run InfoSphere DataStage jobs.
      Note: The principal user is the one that is used to launch processes within the cluster. The principal cannot be shared between InfoSphere DataStage users.
    2. Verify that you have run the kinit command to obtain a Kerberos ticket-granting ticket on the cluster where all InfoSphere DataStage users will run jobs.
  4. Restart the YARN client.
    For example, run $APT_ORCHHOME/etc/yarn_conf/stop-pxyarn.sh and then $APT_ORCHHOME/etc/yarn_conf/start-pxyarn.sh.
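
Example

The settings and checks in the procedure can be sketched as a small set of shell helper functions. This is an illustrative sketch only, not IBM tooling: every function name and the PX_YARN_CONF variable are hypothetical, the yarnconfig.cfg file is assumed to use NAME=value lines, and the /users/<name> layout simply follows the dsadm example in step 2. The hdfs and klist commands must be available on the path for the last checks to work.

```shell
#!/bin/sh
# Sketch of pre-flight checks for multi-user job submission.
# All function names are hypothetical helpers, not part of the product.

# Default directory of yarnconfig.cfg, as given in step 2.
PX_YARN_CONF="${PX_YARN_CONF:-/opt/IBM/InformationServer/Server/PXEngine/etc/yarn_conf}"

# Print the yarn-site.xml properties from step 1 (non-Kerberos clusters).
print_yarn_site_properties() {
    cat <<'EOF'
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users</name>
  <value>false</value>
</property>
EOF
}

# Succeed if the given file sets APT_YARN_MULTIPLE_USERS to true.
# Assumption: the file uses NAME=value lines.
multiple_users_enabled() {
    grep -Eq '^[[:space:]]*APT_YARN_MULTIPLE_USERS[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}

# Succeed if the user's HDFS home directory exists (needs the hdfs CLI).
# The /users/<name> layout follows the dsadm example in step 2.
hdfs_home_exists() {
    hdfs dfs -test -d "/users/$1"
}

# Succeed if the credential cache holds a readable, unexpired Kerberos ticket.
have_kerberos_ticket() {
    klist -s
}

# Restart the YARN client with the scripts named in step 4.
restart_yarn_client() {
    "$APT_ORCHHOME/etc/yarn_conf/stop-pxyarn.sh" &&
        "$APT_ORCHHOME/etc/yarn_conf/start-pxyarn.sh"
}
```

After sourcing these functions, a check such as multiple_users_enabled "$PX_YARN_CONF/yarnconfig.cfg" && hdfs_home_exists dsadm could be run before calling restart_yarn_client. Keeping each check in its own function lets you skip the ones that do not apply, for example have_kerberos_ticket on a cluster that does not use Kerberos.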

Results

Log files for jobs run by individual users are written to the following files, where $user_name is the name of the user who ran the job:
  • /tmp/yarn_client.$user_name.out
  • $APT_ORCHHOME/etc/logs/yarn_logs/yarn_client.$user_name.out
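
To inspect the logs for one user, the two paths above can be built from the user name. The following is a minimal sketch; the yarn_client_logs function name is hypothetical, and it assumes that APT_ORCHHOME is set in the current shell.

```shell
#!/bin/sh
# Print the two YARN client log paths for the given user name, one per line.
# yarn_client_logs is a hypothetical helper, not part of the product.
yarn_client_logs() {
    user_name="$1"
    printf '%s\n' "/tmp/yarn_client.${user_name}.out"
    printf '%s\n' "${APT_ORCHHOME}/etc/logs/yarn_logs/yarn_client.${user_name}.out"
}

# Example use (commented out so that sourcing this file has no side effects):
# for f in $(yarn_client_logs dsadm); do
#     if [ -f "$f" ]; then tail -n 20 "$f"; fi
# done
```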