Troubleshooting
Problem
After installing DSX Local, I cannot connect to DSX Admin
Symptom
I have just installed Build 271 of DSX Local.
When I try to connect to DSX Admin, the connection fails, and the ibm-nginx pod log shows an error:
[root@smartbmchost9 ~]# kubectl logs ibm-nginx-4254780576-dk8w6 -n ibm-private-cloud
starting up
setting up ssl
using default certs
copying the ml conf files ...
2017/07/16 13:10:16 [emerg] 12#12: host not found in upstream "usermgmt-svc.sysibm-adm.svc.cluster.local" in /usr/local/openresty/nginx/conf/nginx.conf:73
nginx: [emerg] host not found in upstream "usermgmt-svc.sysibm-adm.svc.cluster.local" in /usr/local/openresty/nginx/conf/nginx.conf:73
nginx seems to have stopped, exiting..
[root@smartbmchost9 ~]#
If I connect to the web interface, I get a timeout (blank page).
I checked that everything was running:
kubectl get po --all-namespaces | grep usermgm
[root@smartbmchost9 k8s]# kubectl get po --all-namespaces | grep usermgm
sysibm-adm          usermgmt-830597880-42pmb                              1/1       Running            0          10h
sysibm-adm          usermgmt-830597880-mwg9t                              1/1       Running            0          10h
If I run:
kubectl get po --all-namespaces | grep ibm-nginx
I get:
[root@smartbmchost9 k8s]# kubectl get po --all-namespaces | grep ibm-nginx
ibm-private-cloud   ibm-nginx-3111505050-1x4kv                            0/1       CrashLoopBackOff   129        10h
ibm-private-cloud   ibm-nginx-3111505050-5pphk                            0/1       CrashLoopBackOff   129        10h
ibm-private-cloud   ibm-nginx-3111505050-t8bvl                            0/1       CrashLoopBackOff   129        10h
[root@smartbmchost9 k8s]#
So the ibm-nginx pods are stuck in CrashLoopBackOff. How can I fix this?
Cause
Name resolution fails because of the search option in /etc/resolv.conf on the cluster nodes. As a result, nginx cannot resolve usermgmt-svc.sysibm-adm.svc.cluster.local and exits.
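To check whether a node is affected, you can look for an active (uncommented) search line. The following is a minimal sketch; the function name active_search_lines is hypothetical and not part of DSX Local.

```shell
# Hypothetical helper: print the active (uncommented) "search" lines of a
# resolv.conf-style file. Defaults to /etc/resolv.conf when no file is given.
# Exit status is 0 when at least one such line exists, non-zero otherwise.
active_search_lines() {
  grep -E '^[[:space:]]*search[[:space:]]' "${1:-/etc/resolv.conf}"
}
```

Running active_search_lines on each node prints the offending line, if any. No output means the search option is already absent or commented out on that node.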
Environment
Red Hat Enterprise Linux 7.2
Resolving The Problem
This is due to a DNS issue.
Edit the resolv.conf file on each node and comment out the search line:
- open the file (/etc/resolv.conf), with vi for instance
- comment out the line starting with 'search', e.g.
#search fisc.ibm.com
- stop and disable NetworkManager (so that it does not restore the search setting automatically), then start and enable the network service:
systemctl disable NetworkManager && systemctl stop NetworkManager && systemctl start network && systemctl enable network
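If there are many nodes, the manual edit of /etc/resolv.conf could be scripted. The following is a minimal sketch using sed; the function name comment_out_search is hypothetical, and keeping a .bak backup copy is an assumption rather than part of the documented procedure.

```shell
# Hypothetical helper: comment out every active "search" line in a
# resolv.conf-style file (defaults to /etc/resolv.conf).
# The original file is kept next to it with a .bak suffix.
comment_out_search() {
  # "&" re-inserts the matched text, so "search x" becomes "#search x".
  sed -i.bak 's/^[[:space:]]*search[[:space:]]/#&/' "${1:-/etc/resolv.conf}"
}
```

Run comment_out_search as root on each node, then apply the systemctl commands above so that NetworkManager does not rewrite the file.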
Delete the pods that have crashed:
kubectl get po --all-namespaces | grep ibm-nginx | awk '{system("kubectl delete pod -n="$1" "$2)}'
[root@smartbmchost9 k8s]# kubectl get po --all-namespaces | grep ibm-nginx | awk '{system("kubectl delete pod -n="$1" "$2)}'
pod "ibm-nginx-3111505050-1x4kv" deleted
pod "ibm-nginx-3111505050-5pphk" deleted
pod "ibm-nginx-3111505050-t8bvl" deleted
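The one-liner above deletes pods immediately. If you prefer to review what will be deleted first, a dry-run variant could look like the sketch below; the function name print_delete_commands is hypothetical. It only prints the kubectl delete commands and does not run them.

```shell
# Hypothetical helper: read `kubectl get po --all-namespaces` output on stdin
# and print one `kubectl delete pod` command per pod whose name contains the
# given text. Nothing is deleted by this function itself.
print_delete_commands() {
  # $1 of each input line is the namespace, $2 is the pod name.
  awk -v name="$1" 'index($2, name) { print "kubectl delete pod -n " $1 " " $2 }'
}
```

For example, kubectl get po --all-namespaces | print_delete_commands ibm-nginx lists the commands; appending | sh to that pipeline runs them.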
Then, if you look at the pods again, they should be running:
[root@smartbmchost9 k8s]# kubectl get po --all-namespaces | grep ibm-nginx
ibm-private-cloud   ibm-nginx-3111505050-481ns                            1/1       Running   0          5m
ibm-private-cloud   ibm-nginx-3111505050-srv4s                            1/1       Running   0          5m
ibm-private-cloud   ibm-nginx-3111505050-xqw9b                            1/1       Running   0          5m
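This check can also be scripted instead of reading the listing by eye. The following is a minimal sketch; the function name pods_all_running is hypothetical, and it assumes the column layout shown above (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: read `kubectl get po --all-namespaces` output on stdin.
# Succeeds only if at least one pod name contains the given text and every
# such pod is Running with all containers ready (READY column such as 1/1).
pods_all_running() {
  awk -v name="$1" '
    index($2, name) {
      found = 1
      split($3, ready, "/")   # READY column, e.g. "0/1"
      if ($4 != "Running" || ready[1] != ready[2]) bad = 1
    }
    END { exit (found && !bad) ? 0 : 1 }
  '
}
```

For example, kubectl get po --all-namespaces | pods_all_running ibm-nginx && echo "ibm-nginx is up" prints the message only once all of the recreated pods are ready.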
If deleting the nginx pods does not resolve the issue, try deleting the usermgmt pods:
[root@smartbmchost9 k8s]# kubectl get po --all-namespaces | grep usermgmt | awk '{system("kubectl delete pod -n="$1" "$2)}'
pod "usermgmt-830597880-42pmb" deleted
pod "usermgmt-830597880-mwg9t" deleted
Document Information
Modified date: 07 December 2018
UID: ibm10745907