Maintenance Mode task hangs
I told one of my nodes to enter maintenance mode and it sat overnight like this:
That screenshot was taken almost exactly 26 hours later. There were no running VMs on the host, nothing on the local datastore, no resyncing or rebuilding objects in vSAN, and nearly zero I/O on the network adapters.
I tried canceling the task, but it would not cancel.
I rebooted the host, and it came back into the cluster with that task still running.
I rebooted my vCenter, and that finally killed the task.
But why?
When entering maintenance mode, I simply selected “Ensure accessibility” as I always do, then clicked OK.
Once the host was back, I decided to follow the recommendation to run the data migration pre-check, as noted here:
So here’s the kicker:
Selecting “Ensure accessibility” didn’t give me the smoking gun. It only reported, with a yellow warning triangle, that some objects would become “non-compliant”.
Now, when I selected “No data migration”, that really showed me the problem:
Notice that the storage policy is “k8s-fft-0”, and it’s exactly what it says it is. I had deployed Tanzu/TKG with a supervisor cluster, then created a new namespace and, inside it, a vSphere pod. vSphere pods do not support vMotion, so the pod could not be migrated off the host.
Since this was just a test, I deleted that Namespace in the Workload Management menu of the vSphere Client.
After that, the host entered Maintenance Mode just fine.
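If you would rather check for leftovers from a script than from the UI, here is a minimal Python sketch of the idea: list whatever is still powered on for a host before asking it to enter maintenance mode. It assumes a host object shaped like a pyVmomi vim.HostSystem (a `vm` list whose items have `name` and `runtime.powerState`), and it assumes the pod shows up as a VM object on the host, which may not hold in every setup. The `fake_vm` helper and the inventory names are made up for illustration; a real host object would come from a pyVmomi session.

```python
from types import SimpleNamespace


def powered_on_vms(host):
    """Return the names of VMs on `host` that are still powered on.

    `host` only needs a `.vm` list whose items expose `.name` and
    `.runtime.powerState`, which matches the shape of a pyVmomi
    vim.HostSystem.
    """
    return [vm.name for vm in host.vm
            if str(vm.runtime.powerState) == "poweredOn"]


def fake_vm(name, state):
    # Hypothetical stand-in for a vim.VirtualMachine, for illustration only.
    return SimpleNamespace(name=name, runtime=SimpleNamespace(powerState=state))


if __name__ == "__main__":
    # Made-up inventory, not taken from the cluster described in this post.
    host = SimpleNamespace(vm=[
        fake_vm("example-pod", "poweredOn"),
        fake_vm("template-vm", "poweredOff"),
    ])
    print(powered_on_vms(host))  # prints ['example-pod']
```

An empty list does not guarantee the host will enter maintenance mode cleanly, so the data migration pre-check is still worth running.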