IBM Support

How can I use external C++ code with DataStage as a transformer function (PX Routine)


Question

How can I use external C++ code with DataStage as a transformer function (a Parallel Job Transform Routine, also known as a PX Routine)?

Answer

PX Routine Tips and Tricks

The PX Routine supports two types of external object linking.


Object Type:
Choose Library or Object. This specifies how the C function is linked in the job.
-Library: the function is not linked into the job, and you must ensure that the shared library is available at run time. For this invocation method the routine must be provided in a shared library rather than an object file.
-Object: the function is linked into the job, and so does not need to be available at run time. If you use the Object option and subsequently update the function, the job will need to be recompiled to pick up the update.
(From the 7.5.2 help screen)

Notes:
-Some compilers require the source file extension to be an uppercase "C", not a lowercase "c". The uppercase "C" selects a C++ compile, which is required for linking into DataStage.
-Make sure you compile your code with the SAME compiler and options that are defined in the Administrator in APT_COMPILER/APT_COMPILEOPT and APT_LINKER/APT_LINKOPT. These should be the native compiler and options set by the installer.

The first example was built on RHEL 4, with dsenv sourced so that the environment was set up for DataStage.


Steps to use Object code: (Simplest)

1. Compile the external C++ code with -c option:

g++ -c myTest.C -o myTest.o

2. Add a new PX Routine in Designer.
-Routine Name: This is the name used in the Transformer stage to call your function
-Select Object Type
-External subroutine name: This is the actual function name in the C++ code
-Put the full path of the object in the routine definition
-Return Type: Match this datatype to the actual return type of your C++ function
-Arguments: create any arguments that are required by your external C++ function

3. Create a job with a Transformer stage that calls your routine, then compile the job and run it.
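
To illustrate how the Return Type and Arguments fields in step 2 line up with a C++ signature, here is a minimal sketch. The function name my_clamp and its three integer arguments are hypothetical and chosen only for illustration. The field values in the comments are an assumption about how you would fill in the routine definition, not text taken from the product.

```cpp
#include <cassert>

// Hypothetical routine: limits value to the range [low, high].
// In the routine definition this would presumably correspond to:
//   External subroutine name: my_clamp
//   Return Type:              an integer type matching "int"
//   Arguments:                three integer arguments (value, low, high)
int my_clamp(int value, int low, int high)
{
    // Development-time check only; it is compiled out when NDEBUG is defined.
    assert(low <= high);

    if (value < low) {
        return low;
    }
    if (value > high) {
        return high;
    }
    return value;
}
```

Calling the function from a small stand-alone test program before defining the routine in Designer can help separate compile problems from DataStage configuration problems.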



Steps to use Library option: (more complex but allows for linking in other libraries)

1. Compile the code with the -shared option:

g++ -shared myTest.C -o libmyTest.so
(Notice that the library name must begin with "lib". On some platforms, for example 64-bit Linux, you may also need to add -fPIC.)

2. Same as step 2 for Object above, except:
-Select the Library option.
-Use the new libmyTest.so as the library name.
-Put the new shared object (libmyTest.so) in a directory that is in the library path: LD_LIBRARY_PATH, LIBPATH, or SHLIB_PATH, depending on your OS.


3. Compile the job and run
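
A common failure with the Library option is that the shared object is not in any directory named by the library path variable. The sketch below is one way to check that ahead of time. The helper find_in_path_var is hypothetical and not part of DataStage. It only inspects the colon-separated variable you name, whereas the real dynamic loader may also search system default locations, so treat it as a rough diagnostic.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <unistd.h>

// Return the first directory listed in the environment variable `var`
// (colon-separated) that contains a readable file named `lib`,
// or an empty string when no listed directory does.
std::string find_in_path_var(const char* var, const std::string& lib)
{
    assert(var != nullptr);

    const char* value = std::getenv(var);
    if (value == nullptr) {
        return "";  // variable is not set
    }

    std::istringstream dirs(value);
    std::string dir;
    while (std::getline(dirs, dir, ':')) {
        if (dir.empty()) {
            continue;  // skip empty entries such as "a::b"
        }
        const std::string candidate = dir + "/" + lib;
        if (access(candidate.c_str(), R_OK) == 0) {
            return dir;
        }
    }
    return "";
}
```

For example, find_in_path_var("LD_LIBRARY_PATH", "libmyTest.so") would return the directory holding the library, or an empty string if the library still needs to be copied into place.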


Example code (myTest.C):

int my_funct(int x)
{
    return x + 1;
}

$ g++ -c myTest.C -o myTest.o

$ g++ -shared myTest.C -o libmyTest.so

-rwxrwxr-x 1 dsadm dsadm 4477 Feb 17 14:40 libmyTest.so

-rw-rw-r-- 1 dsadm dsadm 699 Feb 17 14:40 myTest.o

Notice the size and permission difference.

On Solaris, a separate link step is required to produce the shared object:

$ /opt/SUNWspro/bin/CC -dalign -O -PIC -library=iostream -c myTest.C -o myTest.o
$ /opt/SUNWspro/bin/CC -G myTest.o -o libmyTest.so

First command gets you the object file.
Second gets you the shared object.

$ ls -l libmyTest.so myTest.o
-rwxr-xr-x 1 dsadm dstage 4064 Feb 17 17:54 libmyTest.so
-rw-r--r-- 1 dsadm dstage 820 Feb 17 17:54 myTest.o



Advanced:

You can link to other libraries so that your routine can call internal DataStage functionality.

For example, to use the functions declared in dsapi.h, follow the Library method described above, but add the required library in the linking step.

This example was compiled on a Solaris OS.

Example Code: (myProjects.C)

#############CUT#######################
#include <stdio.h>
#include <dsapi.h>

// Returns the project list obtained from the DataStage API.
char* projects()
{
    char* prlist;
    prlist = DSGetProjectList();
    return prlist;
}
############PASTE#########################
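
As far as the DSAPI documentation describes it, DSGetProjectList returns a series of null-terminated project names that ends with a second null character. A routine that hands the pointer straight back would therefore likely expose only the first name to the Transformer. The following minimal sketch joins such a list into one delimited string. The helper join_project_list is hypothetical and not part of dsapi.h, and it takes a plain buffer so it can be tried without the DataStage libraries.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Join a double-null-terminated list, laid out in memory as
// "name1\0name2\0\0", into a single string such as "name1,name2".
std::string join_project_list(const char* list, char sep = ',')
{
    assert(list != nullptr);  // the API is documented to return NULL on failure

    std::string out;
    // Each entry ends at a '\0'; an empty entry marks the end of the list.
    for (const char* p = list; *p != '\0'; p += std::strlen(p) + 1) {
        if (!out.empty()) {
            out += sep;
        }
        out += p;
    }
    return out;
}
```

Inside projects() you could pass the result of DSGetProjectList() to a helper like this and copy the joined text into a buffer that outlives the call. How the Transformer expects that memory to be managed is not covered here, so check it for your release.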

/opt/SUNWspro/bin/CC -I/opt/IBM/InformationServer/Server/DSEngine/include -dalign -O -PIC -library=iostream -c myProjects.C -o myProjects.o
/opt/SUNWspro/bin/CC -L/opt/IBM/InformationServer/Server/DSEngine/lib -lvmdsapi -G myProjects.o -o libmyProjects.so

Copy libmyProjects.so to a directory in your library path:

cp libmyProjects.so $APT_ORCHHOME/user_lib
(Note: you may have to create this directory.)



You will need sufficient skills in compiling and linking for your environment. This document is meant as a guide to get you started in understanding the requirements external to DataStage or Information Server products.

Product: IBM InfoSphere DataStage
Platform: AIX, HP-UX, Linux, Solaris
Version: 8.7, 8.5, 8.1, 8.0, 7.5

Document Information

Modified date:
16 June 2018

UID

swg21398620