It’s been a while since I’ve posted anything. This time of year is hard for me, and I’ve basically just been maintaining. I’m doing the minimum to keep my services, and those of my family, running. That means doing basically nothing, since everything runs without much intervention now. I have status monitors, health checks, etc. to alert me if something is amiss. But I do have some plans for 2025.
- Re-organize and clean up cabinet. I’ve even purchased some cable management arms, new patch cables, etc. for this. Just have to get motivated and schedule some downtime to do it. Probably sometime later in January. (Since I started this on 1/1/2025 but here we are 1/18 and not done, the clean-up is scheduled for Sunday, 1/19, starting bright and early at 8am EST)
- Set up a Horizon deployment.
- Set up VCF Holodeck to get the full VCF Experience.
- Create the ultimate Smart Home, fully integrating the new Home Assistant Voice Preview devices.
- AI Training – going to set up Frigate with a Coral TPU for image training and processing, plus set up an LLM that accesses my paperless-ngx and other information sources.
- Build a new backup infrastructure…
Let me stop here. Like I wrote earlier, I have been doing the minimal and depending on alerts/notifications/etc to warn of any issues. I had gotten a notification that my MACHINE_CERT certificate expired in vCenter. So I had to fix that — generated new certificates, verified all hosts could still talk to each other, that vSAN worked, and went on my merry way. This happened on 1/9/2025.
Today, while preparing for tomorrow’s clean-up, I wanted to test an Instant Recovery of a VM. When I log into my Veeam Backup Enterprise Manager, I see that I have 4 FAILED backup jobs. Panic begins to creep in. First, I go check, and I have not had a successful backup since 1/9/2025. Let’s manually trigger a backup job, and at the same time let’s go check the backups that are not managed by Backup Enterprise Manager because they are not compatible (VMs that have PCI passthrough devices have to use the Veeam agent, and the agent controls the backups). These were all fine.
Back to BEM — see this job too has failed. Start investigating, and see errors:
Damn it! I did not think about Veeam when I was correcting my certificate issue in vCenter. Easy fix, but the bigger issue is: why did none of my monitoring alert me that the backups failed? I have tested these notifications before and they worked. What happened now? Well, bonehead mistake on my part. A few months ago, I created a new backup VM. I needed to get off Windows Server 2019, and figured a clean install would be best. Plus, I wanted to make a few other domain-level changes as well. This also meant a clean install of Veeam. Then, due to the changes in ReFS, my original backup repo was initially not available to the new install, so I created a new repo, set up the backup job, and stopped all jobs on the original server. No jobs, no monitoring, no need for notifications.
After I got the repos all situated and merged, and shut down the old backup server, I never enabled notifications on the new one. The backup failure itself was an easy fix. Just re-authenticate your user credentials within Veeam, and problem solved. Fire off another manual run, and this time success. I tested notifications as well and set up more specific job monitoring.
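The lesson I took from this is to alert on the absence of a recent success instead of depending on a failure notification that may never have been switched on. Here is a minimal sketch of that idea in Python. It does not call any Veeam API; it assumes the last-success timestamps are collected some other way, and the job names, the 36-hour threshold, and the `stale_jobs` helper are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_jobs(last_success, now, max_age=timedelta(hours=36)):
    """Return the names of jobs with no successful run within max_age.

    last_success maps a job name to the timezone-aware datetime of its
    last successful run, or None if it has never succeeded.
    """
    stale = []
    for job in sorted(last_success):
        finished = last_success[job]
        # A job that has never succeeded counts as stale too.
        if finished is None or now - finished > max_age:
            stale.append(job)
    return stale


if __name__ == "__main__":
    now = datetime(2025, 1, 18, 12, 0, tzinfo=timezone.utc)
    # Hypothetical job names and timestamps, for illustration only.
    jobs = {
        "infra-vms": datetime(2025, 1, 9, 2, 0, tzinfo=timezone.utc),
        "agent-passthrough": datetime(2025, 1, 18, 3, 0, tzinfo=timezone.utc),
        "never-ran": None,
    }
    for name in stale_jobs(jobs, now):
        print(f"ALERT: no recent successful backup for {name}")
```

A check like this would have flagged the gap within a day or two of 1/9, whether or not email notifications were configured on the backup server.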
- Build a new backup infrastructure, including off-site location and tape backup units.
- Get serious about writing/blogging my adventures.
- Kubernetes
- … and more.
I will take pictures throughout the cleanup/reorg, and once it is complete and I have rested, I will write up something explaining how it went and the pitfalls I ran into.