I am working in a project including vCloud Director as well as most other parts of VMware’s cloud stack for a while now. Until a couple of days ago, everything was running fine regarding the deployment process of vApps from vCloud Director UI or through vCenter Orchestrator. Now we noticed that starting and stopping vApps takes way too long: Powering on a single VM vApp directly connected to an external network takes three steps in vCenter:
The first step of reconfigure virtual machine showed up in vCenter right after we triggered the vApp power on in vCloud Director. From then it took around 5min to reach step two. Once step 2 was completed, the stack paused for another 10min before the VM was actually powered on. This even seemed to have implications on vCenter Orchestrator including timeouts and failed workflows.
We spent an entire day on trying to track the problem down and came up with the opinion that it had to be inside vCloud Director. But before we went into log files, message queues etc, we decided to simply reboot the entire stack: BINGO! After the reboot the problem vanished.
Shutdown Process:
Then boot the stack in reverse order and watch vCloud Director powering on VMs withing seconds 😉