Don’t mind the date on this post; this actually happened many months ago. But while going through things to make up all my blog posts, I came across this.

So I went through HELL with my vCenter instance. Let’s talk about that for a moment. Why is it so hard JUST to change the DNS servers the appliance uses (from the Web UI; after this, I’ll never use the UI for this again)? You have to make the change, then the appliance has to restart everything. Why? Are we back in the Windows 95 days before “plug and play” when you had to restart to do almost anything? This is just changing what the resolver uses. But I digress.

The reason for the change: I was adding a new server to my cluster and vCenter wasn’t resolving it. I manually added the server to my hosts file, but I shouldn’t need to do this. I did a little digging and found that the DNS server vCenter was using was wrong. I made the change, and vCenter got stuck at like 85%. I was still able to access the vCenter UI, though. So I found an article and made the biggest mistake of my life (not really, but...): I didn’t check my backups. Had I checked, I would have found that something broke back in OCTOBER, so my most recent backup was from October. A lot has changed since then. No more OSA vSAN; now it’s ESA. CPU upgrades, etc. So needless to say, my vCenter was hosed.
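A quick way to confirm whether a box can actually resolve a new host is to ask its resolver directly. Here is a minimal sketch in Python, assuming Python is available on whatever machine you run it from; the function name and the hostname in the usage note are made up for illustration. It only tells you what the local resolver (hosts file plus configured DNS servers) says, so it has to run on the machine you are debugging.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the local resolver can turn hostname into an address."""
    try:
        # getaddrinfo goes through the system resolver, so it honors
        # both the hosts file and the configured DNS servers.
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False
    return True
```

For example, `can_resolve("esx-new.example.com")` (a placeholder name) returning False after you have removed the hosts-file entry would point at the DNS servers rather than at vCenter itself.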

I tried so many things to get it back. I attempted to restore an old backup, but this did not work. Reinstalling meant rebuilding everything: my cluster, vDS, networking, etc. I didn’t wanna deal with this either. So it was back to the original instance (well, I cloned it and worked on the clone). I followed a KB, which told me to remove everything in /storage/applmgmt/backup_restore/. Boy, was this a mistake. Despite what the KB said, this caused more problems and led to hours more of troubleshooting.

This included me having to ssh into the appliance, run service-control parallel-restart (it had to be parallel-restart or I could not access the UI), etc. Finally, going back to another cloned instance, I actually looked at the file /storage/applmgmt/backup_restore/restoreReconciliation-history.json.

Wait, this looks promising. Maybe I can edit this file. Specifically, the section:

# more restoreReconciliation-history.json 
{
    "startTS": "2024-03-21T21:02:47.596Z",
    "endTS": "2024-03-21T21:05:25.867Z",
    "state": "FAILED",
    "jobType": "MANUAL",
    "operation": "RECONCILIATION",
    "progress": 70,
    "message": [
        {
            "id": "com.vmware.applmgmt.reconciliation.general_error",
            "defaultMessage": "An error occurred during reconciliation operation. See logs for details. https://vcenter.ghetto-homelab.com/appliance/support-bundle",
            "args": [
                "vcenter.ghetto-homelab.com"
            ]
        }
    ],
    "location": {},
    "locationUser": "",
    "backupSize": 0,
    "duration": 158,
    "id": "20240321-210246-22385739",
    "reconciliationNeeded": {},
    "product": "VC-8.0.2",
    "version": "8.0.2.00000",
    "build": "22385739",
    "fastBackup": {}
}

Ah hah! Let’s play. Let’s make a few changes and reboot, see what happens:

root@vcenter [ /storage/applmgmt/backup_restore ]# more restoreReconciliation-history.json 
{
    "startTS": "2024-03-21T21:02:47.596Z",
    "endTS": "2024-03-21T21:05:25.867Z",
    "state": "SUCCEEDED",
    "jobType": "MANUAL",
    "operation": "RECONCILIATION",
    "progress": 100,
    "message": [],
    "location": {},
    "locationUser": "",
    "backupSize": 0,
    "duration": 158,
    "id": "20240321-210246-22385739",
    "reconciliationNeeded": {},
    "product": "VC-8.0.2",
    "version": "8.0.2.00000",
    "build": "22385739",
    "fastBackup": {}
}
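I made that edit by hand, but the same three changes (state, progress, message) could be scripted. The following is a rough sketch, not anything VMware documents or supports: the function names are mine, and it assumes the file has the same fields as shown above. It keeps a .bak copy first, because hand-editing appliance state is exactly the kind of thing you want to be able to undo. Try it on a clone, not the real instance.

```python
import json
import shutil
from pathlib import Path


def mark_reconciliation_succeeded(history: dict) -> dict:
    """Return a copy of the history record with the failure fields cleared."""
    patched = dict(history)
    patched["state"] = "SUCCEEDED"   # was "FAILED"
    patched["progress"] = 100        # was 70
    patched["message"] = []          # drop the general_error entry
    return patched


def patch_history_file(path: Path) -> None:
    """Back up the JSON file next to itself, then rewrite it patched."""
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # keep the original in case this goes wrong
    history = json.loads(path.read_text())
    patched = mark_reconciliation_succeeded(history)
    path.write_text(json.dumps(patched, indent=4) + "\n")
```

On the appliance this would be called as `patch_history_file(Path("/storage/applmgmt/backup_restore/restoreReconciliation-history.json"))`, followed by the reboot described below.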

Then, after a reboot (I know I could have restarted via service-control --start --all, but I wanted to be sure), everything was fine. But this would not be the last fight I’d have with vCenter in the coming months. No no no. This is part of the reason that I’m seriously considering migrating my compute cluster to Nutanix.
