CES HDFS troubleshooting
This topic contains information on troubleshooting CES HDFS issues. CES HDFS consists of the CES and HDFS Transparency functionality.
For more information on troubleshooting HDFS Transparency, see Second generation HDFS Transparency Protocol troubleshooting.
- Debug, trace, and logs.
Solution:
To check the state of the CES HDFS cluster, see the mmhealth command documentation in the IBM Storage Scale: Command and Programming Reference Guide.
To determine the state of the CES HDFS NameNodes, run the following command:
/usr/lpp/mmfs/hadoop/bin/hdfs haadmin -checkHealth -scale -all
For more information, see the hdfs haadmin command.
For information on how to enable debugging for HDFS Transparency, see Second generation HDFS Transparency Protocol troubleshooting.
- CES HDFS Transparency cluster fails to start when running mmces service enable HDFS or mmces service start hdfs -a.
Solution:
If the NameNode failed to start with the following exception, run:
/usr/lpp/mmfs/hadoop/bin/hdfs namenode -initializeSharedEdits
2019-11-22 01:02:01,925 ERROR namenode.FSNamesystem (FSNamesystem.java:<init>(911)) - GPFSNamesystem initialization failed.
java.io.IOException: Invalid configuration: a shared edits dir must not be specified if HA is not enabled.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:789)
    at org.apache.hadoop.hdfs.server.namenode.GPFSNamesystemBase.<init>(GPFSNamesystemBase.java:49)
    at org.apache.hadoop.hdfs.server.namenode.GPFSNamesystem.<init>(GPFSNamesystem.java:74)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:706)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:669)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:731)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:968)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:947)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1680)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1747)
- MapReduce container job exits with return code 1.
Solution:
If the error Container exited with a non-zero exit code 1. Error file: prelaunch.err occurs while running MapReduce workloads, add the following property to mapred-site.xml to resolve the issue:
<property>
  <name>mapreduce.application.classpath</name>
  <value>/usr/hadoop-3.1.2/share/hadoop/mapreduce/*,/usr/hadoop-3.1.2/share/hadoop/mapreduce/lib/*</value>
</property>
- mmhdfs hdfs status shows node is not a DataNode.
The mmhdfs hdfs status command shows the following errors:
c16f1n13.gpfs.net: This node is not a datanode
mmdsh: c16f1n13.gpfs.net remote shell process had return code 1.
Solution:
Remove the localhost value from the worker host list.
On the worker node, run:
mmhdfs worker remove localhost
- All NameNodes show standby status after mmhdfs start/stop/restart commands.
Solution:
Use the mmces service command to start and stop the NameNodes so that the proper state is reflected for them.
If the mmhdfs start/stop/restart command was executed against the NameNodes, run mmces service start/stop hdfs to fix the issue.
- hdfs dfs -ls or another operation fails with a StandbyException.
Running the hdfs dfs -ls command fails with a StandbyException exception:
[root@scale12 transparency]# /usr/lpp/mmfs/hadoop/bin/hdfs dfs -ls /HDFS
2020-04-06 16:26:25,891 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2010)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1447)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3129)
    at org.apache.hadoop.hdfs.server.namenode.GPFSNamesystem.getFileInfo(GPFSNamesystem.java:494)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1143)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:939)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over scale12/192.0.2.21:8020 after 1 failover attempts. Trying to failover after sleeping for 1157ms.
^C2020-04-06 16:26:27,097 INFO retry.RetryInvocationHandler: java.io.IOException: The client is stopped, while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over scale11/192.0.2.20:8020 after 2 failover attempts. Trying to failover after sleeping for 2591ms.
Both the NameNodes are in standby and CES has failed to select one as active. To verify, run the following command:
/usr/lpp/mmfs/hadoop/bin/hdfs haadmin -getAllServiceState
scale01:8020 standby
scale02:8020 standby
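A quick way to confirm the double-standby condition is to count how many NameNodes report active in the haadmin output. The following is a sketch run against the sample output above; on a live cluster, pipe the real haadmin command output into the same grep instead:

```shell
# Sample double-standby output, as it would come from:
#   /usr/lpp/mmfs/hadoop/bin/hdfs haadmin -getAllServiceState
sample_output="scale01:8020 standby
scale02:8020 standby"

# Count NameNodes reporting "active"; 0 means CES failed to elect an active NameNode
active_count=$(printf '%s\n' "$sample_output" | grep -cw active)
echo "active NameNodes: $active_count"
```

A healthy HA pair shows exactly one active NameNode; a count of 0 matches this failure mode.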
Solution:
- Check the NameNode that should be active by running the following command:
/usr/lpp/mmfs/bin/mmhealth node show -v HDFS_NAMENODE -N cesNodes
- For one of the nodes, the output shows the hdfs_namenode_wrong_state event.
- ssh to that node and set it manually to active by running the following command:
/usr/lpp/mmfs/hadoop/bin/hdfs haadmin -transitionToActive -scale
- Wait for 30 seconds and verify if the NameNode is now active by running the following commands:
/usr/lpp/mmfs/hadoop/bin/hdfs haadmin -getAllServiceState
and
/usr/lpp/mmfs/bin/mmhealth node show -v HDFS_NAMENODE -N cesNodes
- CES HDFS Transparency fails to start if the Java™ version is upgraded.
Solution:
For information on troubleshooting this issue, see HDFS Transparency fails to start if the Java version is upgraded.
- The mmhdfs command cannot recognize FQDN hostnames if the NameNodes or DataNodes were added with short hostnames.
If IBM Storage Scale and HDFS Transparency are both set up with short hostnames, then there is no issue with using a short hostname.
If IBM Storage Scale is set up with FQDNs and HDFS Transparency is set up with short hostnames, then mmhdfs does not recognize the node as a NameNode or DataNode.
For example, the mmhdfs hdfs status command will state that this is not a NameNode and will exit with a return code 1.
Solution:
Set up HDFS Transparency to use FQDNs by updating hdfs-site.xml to set the NameNodes to FQDNs and the worker file hostnames to FQDNs.
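As an illustration only (the nameservice ID and hostnames below are hypothetical, not taken from this topic), the NameNode RPC addresses in hdfs-site.xml would carry the fully qualified names:

```xml
<!-- Hypothetical nameservice "mycluster" and hosts; adjust to your cluster -->
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2.example.com:8020</value>
</property>
```

The worker file should likewise list each DataNode by its fully qualified name, one hostname per line.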
- Multi-HDFS cluster deployment through the IBM Storage Scale 5.1.1.0 installation toolkit is not supported.
Solution:
If you want to create multiple HDFS clusters on the same IBM Storage Scale cluster, perform the following:
- Clear the installation toolkit HDFS metadata by running the following command:
/spectrumscale config hdfs clear
- Follow Adding a new HDFS cluster into existing HDFS cluster on the same GPFS cluster using install toolkit.
Note: Ensure that the fields of the new HDFS cluster are unique from the already existing HDFS cluster. The installation toolkit cannot check for duplicate values. The installation toolkit HDFS metadata is regenerated after the CES HDFS cluster is deployed, but it contains only the new HDFS cluster information.
- mmhealth node shows CES in Degraded state.
When you are creating a CES HDFS cluster, mmhealth node show CES -v shows the CES state as degraded with the hdfs_namenode_wrong_state message.
[root@scale-31 ~]# mmhealth node show CES -v
Node name: scale-31.openstacklocal
Component        Status      Status Change        Reasons
-------------------------------------------------------------------------------------------
CES              DEGRADED    2021-05-05 09:52:29  hdfs_namenode_wrong_state(hdfscluster3)
AUTH             DISABLED    2021-05-05 09:49:28  -
AUTH_OBJ         DISABLED    2021-05-05 09:49:28  -
BLOCK            DISABLED    2021-05-05 09:49:27  -
CESNETWORK       HEALTHY     2021-05-05 09:49:58  -
  eth1           HEALTHY     2021-05-05 09:49:44  -
HDFS_NAMENODE    DEGRADED    2021-05-05 09:52:29  hdfs_namenode_wrong_state(hdfscluster3)
NFS              DISABLED    2021-05-05 09:49:25  -
OBJECT           DISABLED    2021-05-05 09:49:28  -
SMB              DISABLED    2021-05-05 09:49:26  -

[root@scale-31 ~]# mmhealth event show hdfs_namenode_wrong_state
Event Name:   hdfs_namenode_wrong_state
Event ID:     998178
Description:  The HDFS NameNode service state is not as expected (e.g. is in STANDBY but is supposed to be ACTIVE or vice versa)
Cause:        The command /usr/lpp/mmfs/hadoop/sbin/mmhdfs monitor checkHealth -Y returned serviceState which does not match the expected state when looking at the assigned ces IP attributes
User Action:  N/A
Severity:     WARNING
State:        DEGRADED

[root@scale-31 ~]# hdfs haadmin -getAllServiceState
scale-31.openstacklocal:8020    active
scale-32.openstacklocal:8020    standby

[root@scale-31 ~]# mmces address list
Address      Node                     Ces Group          Attributes
-----------  -----------------------  -----------------  ------------------
192.0.2.0    scale-32.openstacklocal  hdfshdfscluster3   hdfshdfscluster3
192.0.2.1    scale-32.openstacklocal  none               none
192.0.2.2    scale-32.openstacklocal  none               none
192.0.2.3    scale-31.openstacklocal  none               none
192.0.2.4    scale-31.openstacklocal  none               none
192.0.2.5    scale-31.openstacklocal  none               none
[root@scale-31 ~]#
The issue here is that the CES IP is assigned to the Standby NameNode instead of the Active NameNode.
Solution:
The following are three solutions for this problem:
- Manually set the active NameNode to standby on the node by running the following command:
/usr/lpp/mmfs/hadoop/bin/hdfs haadmin -transitionToStandby -scale
Then, on the other node, set the standby NameNode to active by running the following command:
/usr/lpp/mmfs/hadoop/bin/hdfs haadmin -transitionToActive -scale
- Move the CES IP to the active NameNode by running the following command:
mmces address move --ces-ip <CES IP> --ces-node <node name>
- Restart the CES HDFS NameNodes by running the following commands:
mmces service stop HDFS -a
mmces service start HDFS -a
- Kerberos principal update not taking effect on changing KINIT_PRINCIPAL in hadoop-env.sh.
Solution:
The CES HDFS Kerberos information is cached at /var/mmfs/tmp/krb5cc_ces. Delete this file to force the update.
- If Kerberos was configured on multiple HDFS Transparency clusters using a common KDC server and the supplied gpfs_kerberos_configuration.py script, kinit with the hdfs user principal fails for all the clusters except the most recent one.
The Kerberos configuration script gpfs_kerberos_configuration.py generates a keytab file for the hdfs user under the default path /etc/security/keytabs/hdfs.headless.keytab. The kinit error occurs because each run of the gpfs_kerberos_configuration.py script updates the keytab file, invalidating the copies of the keytab on the previously configured clusters.
Solution:
From the most recent HDFS Transparency cluster that the script was run, copy the keytab file to all the other HDFS Transparency cluster nodes where the script was run.
For example:
For example, if Hadoop cluster A ran the gpfs_kerberos_configuration.py script, which created the hdfs user principal, and Hadoop cluster B then ran the script, which updated the original hdfs user keytab, copy the hdfs keytab from Hadoop cluster B to Hadoop cluster A to ensure that kinit on Hadoop cluster A works properly.
This limitation has been fixed in HDFS Transparency 3.1.1.6.
- DataNodes are down after system reboot.
Solution:
HDFS Transparency DataNodes might not start automatically after a system reboot. As a workaround, manually start the DataNodes after the system reboot by running the following command as root from one of the CES nodes:
/usr/lpp/mmfs/hadoop/sbin/mmhdfs hdfs-dn start
- HDFS administrative commands, such as hdfs haadmin and hdfs groups, cannot be executed from HDFS clients where Kerberos is enabled.
The CES HDFS service principal uses the CES hostname instead of the NameNode hostname, so the HDFS client rejects the server principal. The administrative commands fail with the following error:
Caused by: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: nn/c88f2u33.pokprv.stglabs.ibm.com@HADOOP.COM, expecting: nn/c88f2u31b.pokprv.stglabs.ibm.com@HADOOP.COM
    at org.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:337)
    at org.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:234)
    at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:160)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:390)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:622)
    at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:413)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:822)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:818)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:818)
    ... 15 more
To resolve this, add the following key to the core-site.xml file on the client:
hadoop.security.service.user.name.key.pattern=*
While using Cloudera Manager:
- Go to Clusters > IBM Spectrum Scale > Configuration > Cluster-wide Advanced Configuration Snippet (Safety Valve) for the core-site.xml file.
- Add the hadoop.security.service.user.name.key.pattern=* parameter and restart the related services.
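When editing core-site.xml directly instead of through Cloudera Manager, the same setting takes the standard Hadoop property form (a sketch; only the property name and value come from this topic):

```xml
<property>
  <name>hadoop.security.service.user.name.key.pattern</name>
  <value>*</value>
</property>
```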