Before we get started, a little info about this post
At a high level, I need to install five (5) PCIe NVMe SSDs into a homelab server. In this post I cover how the CPU and motherboard both play a role in how and where these PCIe cards can and should be connected. I learned that simply having slots on the motherboard doesn’t mean they’re all capable of the same things. My research was eye-opening and really helped me understand the underlying architecture of the CPU, chipset, and manufacturer-specific motherboard connectivity. It’s a lot to digest at first, but I hope this provides some insight for others to learn from. Before I forget, the info below applies to server motherboards, too, and plays a key role in dual-socket boards when only a single CPU is used.
Sometimes the hardest part of any daunting task is simply starting. I got some help from Intel here, though.
I just built a new environment and was greeted by this error. This fix will likely work on other Dell servers, and the settings may apply to other vendors.
At a high level, you need to set TPM2 Algorithm Selection to SHA256 in the BIOS. You MIGHT have to turn on Intel TXT, and then enable Secure Boot. This SHOULD NOT impact the ESXi installation, but there is a chance it might. Enabling Secure Boot on a machine with modified or unsigned files carries the risk of rendering your machine unbootable with the current ESXi installation.
I’m working on importing drivers for Dell’s new 12G servers into our SCCM server for OSD. I got everything imported yesterday, added them to my boot image, created a new boot ISO for use in non-PXE enabled networks, and went home for the day.
I got to work today, booted from the ISO I created yesterday, and was greeted with error 80004005 and some nondescript text stating it couldn’t pull a list of tasks. You know, the typical error that gives you no idea what it actually means.
I googled it and found 80004005 is “Failed to get client identity”, and some pointed out that the time being off may be the cause. I rebooted and saw the BIOS time was maybe 30s off, so I tried again, this time exporting smsts.log (located in X:\windows\temp\smstslog\) via net use to my workstation. I opened that in SMS Trace, and here’s what I found:
Right there in RED is my error, plain as day, but what wasn’t shown to me in WinPE was the “signature varification failed”. I think it’s worthwhile to note Microsoft misspelled vErification, yup, that’s an A in theirs.
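If you don’t have SMS Trace handy, the same red lines can be pulled out with a few lines of Python. This is only a rough sketch: it assumes the log uses the usual `<![LOG[message]LOG]!><... type="N" ...>` entry layout and that `type="3"` marks an error. The function name `error_messages` and the two sample lines are made up for illustration; a real smsts.log carries more attributes per entry.

```python
import re

# Assumed entry layout: <![LOG[message]LOG]!><... type="N" ...>
# Assumption: type="3" marks an error (the entries a trace viewer shows in red).
ENTRY = re.compile(r'<!\[LOG\[(?P<msg>.*?)\]LOG\]!><(?P<attrs>[^>]*)>', re.DOTALL)
TYPE = re.compile(r'\btype="(\d)"')


def error_messages(log_text):
    """Return the message text of every entry whose type attribute is 3."""
    errors = []
    for entry in ENTRY.finditer(log_text):
        kind = TYPE.search(entry.group("attrs"))
        if kind and kind.group(1) == "3":
            errors.append(entry.group("msg").strip())
    return errors


# Made-up, simplified sample lines (not copied from a real log).
SAMPLE = (
    '<![LOG[ordinary informational entry]LOG]!><component="demo" type="1">\n'
    '<![LOG[signature varification failed]LOG]!><component="demo" type="3">\n'
)

if __name__ == "__main__":
    for message in error_messages(SAMPLE):
        print(message)
```

To try it against an exported log, read the file into a string and pass it to `error_messages` instead of `SAMPLE`.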
Googling that, I found This Post, whose author saw the error after moving their SCCM server to new hardware. We didn’t move to new hardware; we went from hardware to virtual, in that we P2V’d our SCCM server last night, which did indeed change the signature of the server.
I updated the boot image’s distribution point, which rebuilds it, then did a refresh for good measure. When that was 100% complete, I recreated the task sequence media boot ISO and all is well again.
A few years ago, we were one of the early adopters of UCS. At that time, it was clearly in its infancy and not ready for prime time; our local Cisco guys didn’t even know anything about it. If you care to read those previous posts, they can be found here: Part 1, Part 2, and Part 3. I was fairly bitter when I wrote those, but with good reason. I ‘wasted’ a lot of time (read: weeks or months) jacking with it and had nothing but problems.
This is an update to my original get-WWN script using Get-View. Get-VMHostHba was pointed out to me by Robert van den Nieuwendijk, vExpert 2012, so I wanted to provide an update to my original post HERE. I attached the ps1 file at the end.
With the addition of Get-VMHostHba in PowerCLI, you can get this information somewhat more easily. At line 46
I’m sure many of you have dealt with trying to figure out how much RAM you can shove in a box, say an R720, and still keep RAM speeds up. I actually had some docs from Dell, figures, diagrams, graphs, and a few charts. Even then, it was difficult.
Enter the “Dell 12G Memory Solution Tool”. It is a website that allows you to test RAM & CPU configurations to get optimal speeds. For instance, you can select the R720, 2 CPUs, and that you want 256GB of RAM. That’s a nicely sized box for virtualization. The tool tells me I can get 16x 16GB 2Rx4 DIMMs at 1333MHz or 1600MHz. Of course, I’m going to go with the 1600MHz! What if I want to bump the RAM? I checked out 384GB & 512GB to see how they stack up: 384GB gives me the option for 24x 16GB DIMMs, but drops my speed to 1066MHz; and 512GB has two options, either 800MHz or 1333MHz (yes please!).
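The capacity side of those results is simple arithmetic, and a short sketch makes it explicit. The function name `dimm_count` is mine, not part of Dell’s tool, and it only works out how many equal-sized DIMMs make up a target capacity; the speed each configuration actually runs at still has to come from the tool.

```python
def dimm_count(total_gb, dimm_gb):
    """Number of equal-sized DIMMs needed to reach total_gb exactly."""
    count, remainder = divmod(total_gb, dimm_gb)
    if remainder:
        raise ValueError(f"{total_gb}GB is not a multiple of {dimm_gb}GB DIMMs")
    return count


if __name__ == "__main__":
    # Capacities and DIMM size taken from the examples in the post.
    for total in (256, 384):
        print(f"{total}GB = {dimm_count(total, 16)} x 16GB DIMMs")
```

This reproduces the 16x and 24x counts quoted above for 256GB and 384GB.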
It also shows you some quick price & power consumption rankings on a 1-5 scale.
When building a new cluster, your storage team (or you) may need to add several hosts into the shared storage zone. It’s a pain to go to each host, open Configuration, then Storage Adapters, and copy out the WWN.
With this script, you can supply a vCenter server and Cluster/Folder/Datacenter (any logical container) and it will list all the WWNs for Fibre Channel devices. But what if you don’t have vCenter stood up yet? No problem, you can also supply a list of ESX/ESXi hosts to scan.
Shawn & I built this because we have 20 hosts we need the WWNs from to provide to our storage team, and vCenter isn’t alive yet.
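Storage teams usually want WWNs as colon-separated hex. Here is a small sketch of just that formatting step (not the script itself), assuming the WWN arrives as a 64-bit integer. The function name `format_wwn` and the sample value are hypothetical.

```python
def format_wwn(wwn):
    """Format a 64-bit WWN integer as eight colon-separated hex bytes."""
    if not 0 <= wwn < 2**64:
        raise ValueError("WWN must fit in 64 bits")
    raw = f"{wwn:016x}"  # zero-padded to 16 hex digits
    return ":".join(raw[i:i + 2] for i in range(0, 16, 2))


if __name__ == "__main__":
    sample = 0x2000001B32123456  # hypothetical WWN, for illustration only
    print(format_wwn(sample))  # prints 20:00:00:1b:32:12:34:56
```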
We all hate that adding DDR3 sticks to a server slows down the QPI speed (or RAM Bus for lack of a better example).
That changes with the Nehalem EX proc (and perhaps Westmere), as the CPU governs the speed. You can throw up to 16 sticks of DDR3 RAM per CPU at either 800, 978, or 1066MHz, and the governing factor is the CPU:
When we’re ready to deploy new ESXi hosts in our environment, we order them from Dell with ESXi pre-loaded on the internal SD-Card. This is nice and all, but what do you do when you have to go through and configure NTP, Users, Groups, Scratch directory, lockdown mode, and the list goes on?
You’d have to fire up each server, go through and configure everything, x10 if you had 10 new servers.
Since we’re working on a new, rather large virtualization deployment, we were looking at ways to overcome this.
Well, Cisco finally came back with an answer to why I was able to break the stuff like clockwork before, and that answer was firmware. A new firmware has been released for the chassis, blades, & FEX (and I’m sure I’ve got that in either the wrong order or the wrong hardware), but I can’t say I’m excited about it.
We set more time aside to have Cisco come in and upgrade the bits, as if we haven’t wasted enough time already. This time, they sent the big guns to work on it, or gun, rather, as they sent one of the engineers named Troy. He was a good guy, very knowledgeable, but he can’t help it that he works for Cisco, we’ve all gotta eat, right?