Before we get started, a little info about this post
At a high level, I need to install five (5) PCIe NVMe SSDs into a homelab server. In this post I cover how the CPU and motherboard each play a role in how and where these PCIe cards can and should be connected. I learned that simply having slots on the motherboard doesn’t mean they’re all capable of the same things. My research was eye-opening and really helped me understand the underlying architecture of the CPU, chipset, and manufacturer-specific motherboard connectivity. It’s a lot to digest at first, but I hope this provides some insight for others to learn from. Before I forget, the info below applies to server motherboards too, and plays a key role on dual-socket boards when only a single CPU is installed.
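Rough numbers help frame the problem (a generic example, not my exact build): NVMe SSDs typically want four PCIe lanes each, so five of them is 20 lanes before you’ve plugged in anything else. A mainstream desktop CPU only exposes somewhere in the neighborhood of 16-24 lanes directly; everything beyond that hangs off the chipset, which shares a single narrow uplink back to the CPU. That’s why a board can offer plenty of physical slots while only a couple of them get dedicated CPU lanes, and why slot choice matters as much as slot count.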
Sometimes the hardest part of any daunting task is simply starting. I got some help from Intel here, though.
My intention was to have additional mandatory parameters based on additional switches. For instance, if you add “-createvNet”, the script needs four additional parameters. Also, if you use “-EnableVMInternet” without “-createvNet”, the script needs to recognize that “-createvNet” wasn’t supplied and make the parameters that go with it mandatory anyway. Spoiler: that didn’t work.
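For context, the pattern I was chasing is what PowerShell calls dynamic parameters. Here’s a minimal sketch of that approach with made-up function and parameter names, just to show the mechanics (this is not the script from this post, and your mileage may vary):

function New-LabVM {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$VMName,
        [switch]$createvNet,
        [switch]$EnableVMInternet
    )
    DynamicParam {
        # If -createvNet (or -EnableVMInternet, which implies a vNet) was supplied on the
        # command line, surface four extra parameters and mark them mandatory.
        if ($createvNet -or $EnableVMInternet) {
            $dict = [System.Management.Automation.RuntimeDefinedParameterDictionary]::new()
            foreach ($name in 'vNetName','vNetSubnet','vNetGateway','vNetDnsServer') {
                $attr = [System.Management.Automation.ParameterAttribute]::new()
                $attr.Mandatory = $true
                $attrs = [System.Collections.ObjectModel.Collection[System.Attribute]]::new()
                $attrs.Add($attr)
                $dict.Add($name, [System.Management.Automation.RuntimeDefinedParameter]::new($name, [string], $attrs))
            }
            return $dict
        }
    }
    process {
        # Dynamic parameters land in $PSBoundParameters rather than pre-created variables.
        if ($createvNet -or $EnableVMInternet) {
            Write-Verbose ("Would create vNet {0}" -f $PSBoundParameters['vNetName'])
        }
    }
}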
I told one of my nodes to enter maintenance mode and it sat overnight like this:
That screenshot was taken almost exactly 26 hours later. There were no running VMs on the host, nothing on the local datastore, no resyncing or rebuilding objects in vSAN, and nearly zero I/O on the network adapters.
I tried canceling the task; it would not cancel.
I rebooted the host; it came back into the cluster with that task still running.
I rebooted my vCenter, and that finally killed the task.
Today I’m midway through setting up my lab, and I realized the reason VMware Cloud Foundation (VCF) is failing is that I set the wrong password in my JSON file for the root account on my vCenter appliance.
No big deal, right? Just SSH in and change it. I tried, and got this:
BAD PASSWORD: it is based on a dictionary word
passwd: Authentication token manipulation error
The bypass was actually easy. Presumably you’re already SSH’d in as root, so you just need to edit /etc/pam.d/system-password:
# Begin /etc/pam.d/system-password
# use sha512 hash for encryption, use shadow, and try to use any previously
# defined authentication token (chosen password) set by any prior module
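On the Photon OS-based appliances I’ve looked at, the lines that follow those comments look roughly like this (quoted from memory, so check your own file rather than trusting me, and the exact complexity options will differ):

password  requisite   pam_cracklib.so   <complexity options> enforce_for_root
password  required    pam_unix.so       sha512 shadow use_authtok

The pam_cracklib entry, and its enforce_for_root option in particular, is what subjects root to the dictionary/complexity check; without enforce_for_root, root only gets a warning. Temporarily relaxing that line lets passwd accept the password you actually need, and you can put the original line back afterward.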
I just built a new environment and was greeted by this error. This fix will likely work on other Dell servers, and the settings may apply to other vendors.
At a high level, you need to set TPM2 Algorithm Selection to SHA256 in the BIOS. You MIGHT have to turn on Intel TXT, and then enable Secure Boot. This SHOULD NOT impact the ESXi installation, but there is a chance it might: enabling Secure Boot on a machine with modified or unsigned files carries the risk of rendering your machine unbootable with the current ESXi installation.
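One sanity check worth running before you flip Secure Boot on: ESXi ships a validation script that reports whether the current installation (unsigned or modified VIBs, for example) would pass Secure Boot’s checks. From an SSH session on the host:

/usr/lib/vmware/secureboot/bin/secureBoot.py -c

If it reports that Secure Boot can be enabled, you’re in much better shape than guessing.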
I’m blogging about this because I always seem to forget where to find the status of the Tier-0 Logical Router, basically which edge transport node is Active and which is Standby for that specific Tier-0 Gateway. It’s easy once I remember, but hitting the search engines doesn’t show anything useful, so I’ll try to keyword spam this to get more visibility for the next time I forget.
TL;DR: Switch to Manager mode, click the Networking tab, then Tier-0 Logical Routers, and select the T0 you want. Look under High Availability Mode (screenshot below).
I’ve been intending to deploy NSX-T 2.4 since its release a few months ago to check out what’s new.
With that, I learned a little about a repeatable workflow to deploy it in a relatively easy way.
Let’s get started
This assumes you already have your vCenter deployed with a vSphere cluster and port groups set up. With NSX-T 2.4 (-T hereafter), you no longer have controllers separate from your manager; you deploy a single manager and then add additional managers to make it a cluster. You’ll want one or three NSX Managers, depending on whether this is a lab, testing, or production, and if it’s a cluster, you’ll likely want an additional IP to serve as the cluster VIP. If you’re keeping count, that’s four (4) IPs, which is how I’m going to deploy it.
VMware has exploded into Software Defined Networking (SDN) with NSX, and it’s no secret why it’s their fastest growing product. Through the use of all the components within NSX, you can be well on your way to a fully Software Defined Datacenter (SDDC), accomplishing things like automated deployment of networks, edge devices, NAT rules, firewall rules, and the list goes on.
Over the last year, we’ve been doing a lot of testing with VMware Cloud on AWS (VMC), and it’s pretty slick. In the past, we’ve used our physical perimeter device (Cisco ASA) to handle the VPN traffic, but yesterday I wanted to set up a VPN to the management gateway, and I wanted it done now. Since I don’t have direct access to the ASA, I’d have to submit a ticket to our NetSec team to have them do it, and they have their own work going on, so naturally I decided to use an NSX Edge for this.
I pulled up the two interfaces side by side so I could fill out both at the same time, but I noticed the VMC side was missing a few things that I had on the NSX side: Local ID & Peer ID. The VMC side also had options for IKE & SHA versions, which I didn’t have on the NSX side. Keep those in mind as you step through this. Let’s get started…
I got a text message this evening from a colleague of mine (@FrankRax) stating our lab was down. I tried to hit the vCenter, and the hosts & clusters view wouldn’t load in the web client; it just left me with the spinning wheel:
Okay, that’s fine, so I’ll check the VAMI, or Management UI of the VCSA, but then I got really scared when I saw this:
This isn’t a fresh install; it’s been a lab for a long time, and it was actually even upgraded to 6.5u1 not that long ago. Now I know for a fact something’s gone wrong, so I launched the host client on each node in the cluster until I found the vCenter Server Appliance VM and opened its console, and I was pretty much horrified at what I saw. The following content may be disturbing to some audiences; viewer discretion is advised.