Nutanix, here I come!

I’m considering migrating to Nutanix. Why? Because of all the problems Broadcom has caused with VMware, I do not see much of a future in learning VMware products. We’re evaluating migrating our workloads at work to Nutanix, Proxmox, or XCP-ng, so why not fully embrace Nutanix? (Why not Proxmox, XCP-ng, Hyper-V, etc.? No justifiable reason other than I do not like the Proxmox UI at all, and with XCP-ng, while you can compile their management platform yourself, for storage I’d have to use something like StarWind vSAN, and I don’t want another platform to manage.)

Nutanix checks almost all of the boxes. What are these boxes?

  • High Availability
  • DRS-like features
  • NVIDIA vGPU Support
  • Integrated Storage Management
  • Single Pane of Management
  • External Storage (iSCSI/FC)

Wait, I thought you cannot have external storage with Nutanix? That’s the beauty of Nutanix: you can install either ESXi OR AHV as your hypervisor. Now I know what you’re thinking: didn’t you want to move away from the VMware ecosystem? Well, I do. But I need a way to migrate with zero downtime, as I do run production loads in my cluster. And to answer the question, this is the only box that Nutanix does not fully check. If running with ESXi, you can have external storage. But with AHV? Nope. Therein lies my problem. Some of my workloads actually run on storage connected via iSCSI. Production workloads that do not require high IOPS run just fine on spinners. My other workloads, though, are on the vSAN ESA datastores (and some are on the NVMe iSCSI datastores).

Anyway. Another box that Nutanix doesn’t check: NVMe-oF. I want to set up NVMe over Fabrics, NVMe over RDMA, etc. If I go full Nutanix, this is out unless I either 1) build a nested lab, or 2) invest in more hardware.

I will have to add more storage if I go Nutanix. Since I will no longer have access to the iSCSI shares, I need more space on the cluster. From everything I’ve read, Nutanix’s default storage design is replication factor 2 (RF2): every write is kept on two nodes, basically RAID1 economics cluster-wide, so I will have about half the raw space usable. Quick napkin math below.
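Here’s that napkin math as a tiny script. The drive counts and sizes are placeholders, not my actual inventory, and the CVM overhead figure is a rough assumption on my part:

```python
# RF2 keeps two copies of every write, so usable capacity is roughly
# raw / 2, minus whatever the CVMs and metadata reserve for themselves.
# All numbers here are hypothetical placeholders.

raw_tb = 4 * 8.0            # e.g. 4 nodes x 8 TB raw each
replication_factor = 2      # Nutanix default (RF2)
cvm_overhead_tb = 4 * 0.5   # rough per-node CVM/metadata reserve (assumption)

usable_tb = raw_tb / replication_factor - cvm_overhead_tb
print(f"~{usable_tb:.1f} TB usable out of {raw_tb:.1f} TB raw")  # ~14.0 of 32.0
```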

Migration Plan

I know I have not gotten to the why exactly, but I wanted to get this migration plan down. Note: this is a work in progress until I actually do it. But…

  • Migrate all VMs from the vSAN datastore to the NAS iSCSI datastore. Why? Because Nutanix cannot read vSAN’s on-disk format and will need to reformat the drives.
  • Set up a vDS on esxi01 and esxi02 mirroring the vDS on the Cisco UCS cluster.
  • vMotion all VMs to esxi01 and esxi02. These two hosts will not be migrated to Nutanix. These run my TrueNAS VMs, vCenter, etc.
  • Shut down esxi03 through esxi06.
  • Verify all VMs are still running with no errors, that nothing is left on the UCS cluster, and that nothing depends on it (see the inventory-check sketch after this list).
  • Install a UCS-MSTOR with a 128GB SSD in each server for boot.
  • IF I can get a version of Nutanix Foundation, then follow the instructions in the Nutanix Cisco UCS Field Installation Guide and reset CIMC to default values, etc. This includes backing up each UCS config just in case (I put a lot of work into my vNICs and getting my networking just how I want it); the second sketch after this list shows how I plan to dump that config first.
    • I need to look into what resetting CIMC to default values actually wipes. I want to verify whether I can keep my existing vNIC config; it would be nice to keep the same vNICs and use them for similar purposes. I currently partition my 40Gb QSFP+ port as below. This presents five separate network adapters to the operating system and segments the traffic. This setup is replicated across both ports for redundancy.
      • vNIC0 – Management Traffic
      • vNIC2 – VM Traffic
      • vNIC3 – vSAN Network
      • vNIC4 – vMotion Network
      • vNIC5 – Storage Network
  • Again, IF I can get Foundation, then start the installation. I will be installing to the new MSTOR devices. However, for the hypervisor, I will select ESXi, and I will copy the Cisco-specific ISO to the Foundation VM. Why ESXi? So I can migrate my VMs back to the new cluster. Remember, Nutanix AHV does not support external storage, so in order to do a “live” migration, I have to do it this way. Foundation will then install ESXi on each node, configure the vSwitches and Distributed vSwitches, deploy the CVMs, set up storage, and everything else. I will walk away, come back in about an hour, and have a working Nutanix cluster running ESXi.
  • If I cannot get Foundation, then I will have to manually install Nutanix on each host, then create and start the cluster. ESXi will then be configured with Nutanix providing storage. But before I can migrate my VMs, I must fix my setup so my drives/HBA are passed through directly to the CVMs, so that Nutanix AOS completely manages the storage.
  • Migrate all VMs to the new Nutanix cluster.
  • Test, test, and test. Make sure everything is working properly: DRS works, HA works, the works.
  • Once we are sure everything is working and stable, migrate the cluster to AHV. The CVMs will take care of everything (hopefully). No clue how long this will take. I know it happens in a rolling manner: each host is placed in maintenance mode, restarted with AHV installed, and then the VM migration begins while the next node is converted.
  • After everything is migrated, reboot everything to make sure it all comes up cleanly, then test, test, test. After verifying everything is working, re-enable monitoring and enjoy using Nutanix and managing it with Prism!
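Two quick pre-flight sketches to back the steps above. First, the inventory check: before shutting down esxi03 through esxi06, I want proof that nothing is left on the vSAN datastore or still registered to the UCS hosts. A minimal pyVmomi sketch; the vCenter hostname, credentials, datastore name, and host names are placeholders for my environment:

```python
# Pre-flight check: list any VMs still sitting on the vSAN datastore, or
# still registered to the UCS hosts, before I shut esxi03-esxi06 down.
# pip install pyvmomi. Hostname, credentials, and names are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

VSAN_DS = "vsanDatastore"                      # placeholder datastore name
UCS_HOSTS = {"esxi03", "esxi04", "esxi05", "esxi06"}

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE                # homelab self-signed cert

si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="********", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    on_vsan = [vm.name for vm in view.view
               if any(ds.name == VSAN_DS for ds in vm.datastore)]
    on_ucs = [vm.name for vm in view.view
              if vm.runtime.host and vm.runtime.host.name.split(".")[0] in UCS_HOSTS]
    view.Destroy()
    print("Still on vSAN:", on_vsan or "none")
    print("Still on UCS hosts:", on_ucs or "none -- safe to shut down")
finally:
    Disconnect(si)
```

Second, the CIMC config backup. Cisco’s imcsdk Python SDK talks to standalone C-series CIMC, so I should be able to dump the vNIC layout before any factory reset. Again, hostnames and credentials are placeholders, and this is a sketch I’ll adapt, not a finished backup tool:

```python
# Pre-flight backup: dump the vNIC layout from each CIMC before any reset.
# pip install imcsdk. IPs/credentials are placeholders for my environment.
from imcsdk.imchandle import ImcHandle

for cimc in ["cimc-esxi03.lab.local", "cimc-esxi04.lab.local"]:  # etc.
    handle = ImcHandle(cimc, "admin", "********")
    handle.login()
    try:
        # adaptorHostEthIf is the managed-object class for VIC vNICs
        for vnic in handle.query_classid("adaptorHostEthIf"):
            print(cimc, "->")
            print(vnic)      # printing an MO dumps all of its properties
    finally:
        handle.logout()
```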

Why Nutanix?

Why Nutanix? As I said, with all the problems Broadcom has caused, more and more companies are going to start migrating to Nutanix. This is the perfect time to learn the product, and maybe pick up a few certifications along the way. I am going to follow some tutorials and set up my cluster the way it would be set up in different industries. I’m going to install their version of NSX (Flow), do network segmentation, routing, etc. While VMware products will still be around, I fear they will go the way other Broadcom purchases have gone: neglected, then finally dumped.

Let me add to this. I made a post on Reddit asking about the HCL for the Cisco UCS series. I got a response from the Director of Engineering, who I noticed is very active on Reddit. I like that someone at the director level has time to not only READ posts on Reddit, but also respond and post. That settles it for me. I’ve been really debating, but this settled it. Regardless of what happens at work, my time running ESXi at home is coming to an end. It’s the end of an era. I’ve been using ESXi since version 5, and I still like the C# client. But no more. While I may keep my storage as ESXi (that is, my NAS), everything else will be Nutanix.

I hear some of you: why not migrate your storage cluster? Well, licensing. You can run up to a 4-node cluster with CE. My storage nodes do replicate to each other, but look at my BOM; you will see that one node is only used as a replication target. The R720 is the “2” in my 3-2-1 backup plan. So while I could set them up as a 2-node cluster (violating the CE license, though, right?), it would mean I must keep the R720 powered on at all times and then configure iSCSI and NFS shares from that pool. Not really worth it.

I’m still a few weeks off. I have to order some UCS-MSTOR-M2 or UCS-M2-HRAID modules and some M.2 drives for booting, because I need the HBAs completely unused by the OS so they may be passed through to the CVM to provide storage. I’m ordering 4 so my servers are uniform; I’m trying to get everything almost uniform (C220 vs. C240 is uniform in my book, as long as it’s the same series). But one of my servers has 2x NVMe drives which could be passed through to the CVM and leave the HBA free for booting. So I really have two choices: get the UCS-MSTOR and drives for each server, OR purchase the CBL-NVME-M5 cable, which connects the NVMe drives to the motherboard. However, that cable alone is $150, and I’d still need to purchase at least 6x NVMe drives. The price of this versus the MSTOR devices is more than I want to spend: the NVMe route is, at a minimum, $1k, while the UCS-MSTOR route with 240GB M.2 SATA drives is $572. Almost half the cost (quick math below). I can invest that other $400 in upgrading my final M4 to an M5. I can get an M5 chassis with a RAID controller, and I have everything else I would need other than caddies. It would only be Skylake, though; going Cascade Lake would add another $600 minimum :(.
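For my own sanity, the boot-device math as a script. The MSTOR total is my quote from above; the per-drive NVMe price is a rough guess on my part:

```python
# Boot-device cost comparison. mstor_route is the quoted total for 4x
# UCS-MSTOR + 240GB M.2 SATA drives; the NVMe per-drive price is a guess.
mstor_route = 572
nvme_route = 150 + 6 * 145   # CBL-NVME-M5 cable + 6x NVMe drives (~$145 ea.)

print(f"MSTOR route: ${mstor_route}")
print(f"NVMe route:  ~${nvme_route}")                  # ~$1,020, i.e. the ~$1k estimate
print(f"Saved:       ~${nvme_route - mstor_route} toward the M4 -> M5 upgrade")
```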

What cool things am I going to do?

I’m going to set up the network to mimic the kinds of environments covered by the Nutanix Certified Associate (NCA), Nutanix Certified Professional – Unified Storage (NCP-US), and Nutanix Certified Professional – Multicloud Infrastructure (NCP-MCI) exams, among others.

I am going to continue running my production workloads — more on all that later.

Build out a VDI deployment. It was going to be Horizon, but now… maybe Nutanix’s VDI? Actually, it could still be Horizon now that it’s been spun off. You should read about my beliefs on that, and this entire VMware mess, here (put link here).

Stay tuned…
