Nutanix: VM-based Foundation stuck at 67%

The Foundation process on Foundation VM 3.5 got stuck at 67% for more than an hour on one node, while the other two nodes completed successfully. For as long as it was stuck, there were no updates in the logs under /home/nutanix/foundation/log/ on the Foundation VM. The log for the problem node stopped here:

20170131 03:53:59 INFO Installation of Acropolis base software successful: Installation successful.
20170131 03:53:59 INFO Rebooting node. This may take several minutes: Rebooting node. This may take several minutes
20170131 03:53:59 INFO INFO: Rebooting node. This may take several minutes
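A stall like this can be spotted by checking whether anything in the log directory has been written recently. The following is only a minimal sketch, not a Nutanix tool: the function name log_is_stale, the LOG_DIR and STALE_MINUTES variables and the 60-minute threshold are my own assumptions, and it relies on the -mmin option of find, which GNU find supports.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when no file under the given
# directory has been modified within the last N minutes.
log_is_stale() {
    log_dir="$1"
    minutes="$2"
    # -mmin -N matches files modified less than N minutes ago.
    recent=$(find "$log_dir" -type f -mmin "-$minutes" 2>/dev/null | head -n 1)
    [ -z "$recent" ]
}

# Log path taken from the post; the threshold is an arbitrary choice.
LOG_DIR="${LOG_DIR:-/home/nutanix/foundation/log}"
STALE_MINUTES="${STALE_MINUTES:-60}"

if [ -d "$LOG_DIR" ]; then
    if log_is_stale "$LOG_DIR" "$STALE_MINUTES"; then
        echo "No log updates in the last $STALE_MINUTES minutes; Foundation may be stuck."
    else
        echo "Logs are still being written."
    fi
fi
```

A quiet log directory does not prove that Foundation is hung, so treat the result as a hint to go and look at IPMI rather than as a verdict.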

From IPMI, I could see that the node had rebooted and that the hypervisor had been installed, but not the CVM. I rebooted the node again, which did not change anything.

As there was nothing else I could do, I killed the foundation process and restarted it, which fixed the issue.

First, I noted the full command line of the foundation process:

[nutanix@nutanix-installer 20170131-044651-2]$ ps aux | grep foundation
nutanix   7880  0.1  9.5 1149348 182760 ?      Sl   04:45   1:08 /usr/bin/python -Bu /home/nutanix/foundation/bin/foundation --service
nutanix  10731  0.0  0.0 103248   848 pts/1    S+   21:36   0:00 grep foundation

Then I killed it:

sudo pkill -9 foundation

And started it again with the same command line:

/usr/bin/python -Bu /home/nutanix/foundation/bin/foundation --service

Update: I learned there’s actually a script on the Foundation VM that can restart the foundation service. Running it without arguments prints its usage:

[nutanix@nutanix-installer ~]$ ./foundation/bin/foundation_service
self: ./foundation/bin/foundation_service
Usage: ./foundation/bin/foundation_service {start|stop|restart|status}
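Based on that usage line, the kill-and-start steps could probably be replaced by a single restart call. The wrapper below is only a sketch: restart_foundation is a name I made up, the default path assumes the script lives under /home/nutanix as the prompt above suggests, and I am assuming the script returns a zero exit code when restart succeeds, which I have not verified.

```shell
#!/bin/sh
# Hypothetical wrapper around the foundation_service script shown above.
# Uses only the "restart" and "status" subcommands from its usage output.
restart_foundation() {
    # Assumed default location; pass a different path as the first argument.
    svc="${1:-/home/nutanix/foundation/bin/foundation_service}"
    if [ ! -x "$svc" ]; then
        echo "service script not found or not executable: $svc" >&2
        return 1
    fi
    # Ask for status only if restart reported success (exit code 0 assumed).
    "$svc" restart && "$svc" status
}

# Usage on the Foundation VM:
#   restart_foundation
```

This avoids pkill -9, so the service gets the chance to stop through its own script instead of being killed outright.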

I opened localhost:8000 in the browser. The information I had provided earlier, such as the IPs and mask, was still intact, so I proceeded to the next steps. The two nodes that had completed earlier went straight to 100%, and this time the node that had failed got past 67% and completed successfully.


