IBM Support

DataStage BDFS stage gets error 255 connecting to remote hadoop/hdfs server

Troubleshooting


Problem

A DataStage job with the Big Data File System (BDFS) stage fails with the following error when attempting to connect to a remote hadoop file system via the local BigInsights client: Message: BDFS_0,0: Unable to connect to hdfs host myhost.domain.com on port 50111: Unknown error 255.

Symptom


Examples of other error messages and exceptions associated with HDFS unknown error 255:

Write of hdfs file failed: Unknown error 255.

Unable to connect to hdfs host 153.11.11.161 on port 8020: Unknown error 255.

SEVERE: PriviledgedActionException as:testdata
cause:java.io.IOException: Call to myhost12.mydomain.com/153.11.22.150:50111 failed on local
exception: java.io.EOFException
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1129)
at org.apache.hadoop.ipc.Client.call(Client.java:1097)
at $Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:325)
...
Caused by: java.io.EOFException

Exception in thread "main" java.io.IOException: Could not get block locations. Source file "/user/testdata/testfile.dat" - Aborting...
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2200

DataStreamer Exception: java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:29)


Note that when an exception is passed back to the DataStage job from the HDFS library routines, each line of the exception stack trace may be logged as a separate DataStage event message, so a single exception as shown above may be spread out over many messages in the log file rather than appearing exactly as above.

Resolving The Problem


The more common causes of HDFS unknown error 255 are:

  1. The datanode and namenode are not known to the network domain name server (DNS server) and are also not defined in the /etc/hosts file on the DataStage engine tier machine. In this case, adding those hostnames and their corresponding IP addresses to the /etc/hosts file may resolve the problem.
  2. An error occurred in the hadoop / HDFS file system. Use the following commands to check for file system problems and to find files with missing blocks:
    hadoop fsck / > hdfs_fsck_report.log
    grep -i "miss" hdfs_fsck_report.log
  3. A version incompatibility exists between the hadoop libraries on the source and target systems, or there is a difference in the level of hadoop libraries that the calling program (either BigInsights or DataStage) was compiled with. Check that the hadoop client libraries on the DataStage engine machine are the same version/release as the hadoop libraries on the remote hadoop / hdfs file system.
  4. Configuration files on the hadoop system, such as core-site.xml, may contain an incorrect (or unresolvable) hostname. Check whether core-site.xml was set up to serve on 127.0.0.1 (localhost) instead of 0.0.0.0. An EOF exception can occur when the port is not accessible externally for the defined hostname.
  5. The port number in core-site.xml is wrong or does not match the port that the job is trying to connect to. Update the job's port number or the port number in the xml file, depending upon which is incorrect.
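The first, third, and fifth checks above can be run quickly from a shell on the DataStage engine tier machine. This is a minimal sketch; the NAMENODE and PORT values are hypothetical placeholders and must be replaced with the host and port from your job's connection properties and the cluster's core-site.xml.

```shell
# Hypothetical values for illustration only; substitute your own.
NAMENODE=myhost.domain.com   # namenode host used by the BDFS stage
PORT=8020                    # port from fs.defaultFS in core-site.xml

# 1. Can the engine tier resolve the namenode (via DNS or /etc/hosts)?
getent hosts "$NAMENODE" || echo "NOT RESOLVABLE: add $NAMENODE to DNS or /etc/hosts"

# 2. Is the port reachable from this machine? (uses bash's /dev/tcp)
( exec 3<>"/dev/tcp/$NAMENODE/$PORT" ) 2>/dev/null \
  && echo "port $PORT reachable" || echo "port $PORT NOT reachable"

# 3. What hadoop client level is installed? Compare against the cluster.
command -v hadoop >/dev/null && hadoop version | head -1 \
  || echo "hadoop client not found on PATH"
```

If name resolution fails, fix /etc/hosts or DNS first; if resolution succeeds but the port is unreachable, review the hostname and port bindings in core-site.xml as described in items 4 and 5.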

[{"Product":{"code":"SSVSEF","label":"IBM InfoSphere DataStage"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":"--","Platform":[{"code":"PF016","label":"Linux"}],"Version":"9.1;8.7","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
16 June 2018

UID

swg21632839