One of the huge advantages of Cisco UCS is its approach to “statelessness”. If you are not familiar with this concept, just know that anything that ties an operating system, hypervisor, or application to a specific piece of hardware is considered “stateful” and is not desirable in datacenter servers. With this approach, Cisco has made it extremely easy for a customer to upgrade from one server model to the next without having to re-install anything. To be more specific, I took various operating systems and hypervisors that were running in a service profile assigned to a B200 M2 and moved the profile to the brand new architecture of a B200 M3. The UCS portion of this migration is really (really) easy: you simply disassociate the profile from the M2 and associate it with the M3. The OS or hypervisor takes care of the rest. This article covers the details of how this migration worked and what steps I took to make sure it was a success.
Disclaimer: everything you’re about to read is totally unsupported by Cisco TAC. As a company, we have neither tested nor certified this process. I am simply reporting here what I, myself, have tested and seen work. So don’t call Cisco if this doesn’t work. Feel free to leave a comment and I’ll look into it when I can find time.
I tested a few different installations of various hypervisors and OSes throughout this process and hope to test more as I have time. These are my results thus far:
NOTE: All migration testing was done booting from XIO Emprise 5000 FC SAN using a Cisco VIC CNA.
VIC1240 (2.0.2q) – A 1280 should work identically as they are based on the same ASIC
I installed each of the below onto the M2 and then migrated it to the M3. I was mainly looking for the instance to still boot after migrating. I did not test every aspect of the installation once it was on the M3. For instance, if you added or removed vNICs or vHBAs and caused the PCI bus to re-enumerate, you may have some cleanup to do. I was simply verifying that you could actually get to the point of being able to clean up at all.
- VMware ESXi 4.1 Build 348481 migrated without issue. I was running fnic version 126.96.36.199.2-4vmw (according to vmkload_mod -s fnic)
- VMware ESXi 5.0 Build 441354 migrated without issue. I was running fnic version 188.8.131.52-1vmw (according to vmkload_mod -s fnic)
- Windows Server 2008 R2 migrated without issue in one test I ran, but hit the dreaded STOP 0x7B in a separate attempt. The outcome depends on some factors that we’ll cover in a minute. If you are just keeping score, mark one more down for UCS, provided you installed Windows 2008 using the Cisco-provided Palo drivers version 2.0 or later. If you installed using a version of the drivers prior to 2.0, you will likely see the STOP 0x7B trap screen. The good news is you can still get it to work, but it requires some manual input on your part. What follows is the technical part, so you can skip it if you are not in this situation.
There are certain types of devices that Windows needs to install drivers for prior to the time that the normal plug-n-play process is available. Disk Drive controllers would fall into this category (as do the various bridges that sometimes lead to disk controllers). Because Windows is bypassing the plug-n-play manager, it directly starts these drivers. The portion of the registry that stores all of this is known as the Critical Device Database (CDDB). You can locate it at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CriticalDeviceDatabase. When Windows is starting up, it loads the drivers in the CDDB. When it’s done, it should have access to the boot volume because the boot volume’s controller driver would be in the CDDB. But what if it isn’t? Well, in that case you get STOP 0x7B. Windows is trying to tell you that it has loaded all drivers it has and that the boot volume is still not accessible.
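To make the lookup concrete, here is a toy Python model of it. This is only a sketch of the idea, not how Windows is implemented: the CDDB is reduced to a dictionary, the two hardware IDs are the ones that come up later in this article, and the driver name `hypothetical_fc_driver` is a placeholder I made up.

```python
from typing import Dict, Iterable, Optional

# Hardware IDs taken from the registry key names discussed later in this article.
OLD_ID = "PCI#VEN_1137&DEV_0045&SUBSYS_00481137"  # entry present after an older-driver install
NEW_ID = "PCI#VEN_1137&DEV_0045&SUBSYS_00841137"  # entry the M3's VIC needs (per this article)


def find_boot_driver(cddb: Dict[str, str], controller_ids: Iterable[str]) -> Optional[str]:
    """Return the driver for the first hardware ID found in the toy CDDB, else None."""
    # Compare case-insensitively so the key spelling does not matter in this model.
    normalized = {key.lower(): driver for key, driver in cddb.items()}
    for hw_id in controller_ids:
        driver = normalized.get(hw_id.lower())
        if driver is not None:
            return driver
    return None


def boot_result(cddb: Dict[str, str], controller_ids: Iterable[str]) -> str:
    """No matching entry means no driver for the boot controller: STOP 0x7B."""
    if find_boot_driver(cddb, controller_ids) is None:
        return "STOP 0x7B"
    return "boot volume reachable"


# "hypothetical_fc_driver" is a placeholder name, not a real Windows service.
old_install = {OLD_ID: "hypothetical_fc_driver"}
print(boot_result(old_install, [OLD_ID]))  # controller the OS was installed on
print(boot_result(old_install, [NEW_ID]))  # controller it sees after the move
```

The point of the model is simply that the OS is fine as long as the ID of the controller in front of the boot volume has an entry, and stuck as soon as it doesn't.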
On a normal computer, a STOP 0x7B is quite unpleasant because you can’t always get back into Windows to fix it. If it happens due to a BIOS upgrade, it’s likely that the bridge was updated and you needed to install a new driver before doing the BIOS upgrade. Check out this blog for more detailed info on the CDDB if you are interested. Luckily, with UCS, you can roll the profile back to an M2 with Palo and boot it right back up. Once you are back in Windows, you can fix this problem pretty easily. Here is the problem and solution:
When you install the OS from scratch and feed it the 2.0 or higher drivers during installation on the M2 with Palo, it builds the following tree in the CDDB:
However, if you installed with an earlier version of the drivers, the CDDB tree is not built the same way. It does not contain the entries for the 12×0/mlom, and it looks like this:
This behavior is expected, of course, because the VIC 1240/1280 did not exist at the time you used those older drivers. However, the new Palo (M81KR) drivers contain all the CDDB tree info for both old and new VIC cards. To fix the problem, all you have to do is:
- Boot Windows on the M2.
- In Registry Editor (regedit), select (highlight) HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CriticalDeviceDatabase\PCI#VEN_1137&DEV_0045&SUBSYS_00481137
- Click File->Export and export this key to a file.
- Open the .reg file in Notepad.
- Change the string 00481137 to 00841137 (you are just swapping the “4” and the “8”).
- Save the changes and close the file.
- Double-click the .reg file to import it into the registry.
- Shut down Windows and move the profile to the M3.
I would not suggest editing the string in the registry directly, as this would overwrite the only working string you have. Last Known Good would help here, and will also help if you mess anything up while doing the above steps. Just remember, Last Known Good is only valid until you successfully log in.
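If you would rather script the Notepad edit than do it by hand, the sketch below shows one way it could look in Python. It works on a copy of the exported file, so the original export stays untouched. The function names and file names are my own, and the encoding is an assumption: regedit's default "Version 5.00" export is normally UTF-16 with a byte order mark, but check your file before trusting that.

```python
from pathlib import Path
from typing import Tuple

OLD_SUBSYS = "00481137"  # subsystem ID in the exported key (from the steps above)
NEW_SUBSYS = "00841137"  # subsystem ID to create (from the steps above)


def retarget_reg_text(text: str, old: str = OLD_SUBSYS, new: str = NEW_SUBSYS) -> Tuple[str, int]:
    """Return the edited .reg text and how many occurrences were replaced."""
    count = text.count(old)
    return text.replace(old, new), count


def retarget_reg_file(src: Path, dst: Path, encoding: str = "utf-16") -> int:
    """Write an edited copy of src to dst and return the replacement count."""
    # Assumption: the export is UTF-16 with a BOM; pass another encoding if not.
    # newline="" keeps the file's original line endings as they are.
    with open(src, encoding=encoding, newline="") as handle:
        text = handle.read()
    new_text, count = retarget_reg_text(text)
    if count == 0:
        raise ValueError(f"{OLD_SUBSYS} not found in {src}; nothing to change")
    with open(dst, "w", encoding=encoding, newline="") as handle:
        handle.write(new_text)
    return count
```

A call such as `retarget_reg_file(Path("palo_export.reg"), Path("palo_1240.reg"))` (both file names are hypothetical) would produce the file you then double-click to import. Raising an error when nothing matches is deliberate: an export that does not contain the old string is probably the wrong key.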
I hope this article is helpful to you. Let me know if anything isn’t clear.
Side note: If you want more information on statelessness and how it benefits you, see this blog article. But briefly, take a NIC, for example, which typically has a MAC address burned in at the factory. If you install an application and it licenses itself to the MAC address, you will find it painful to replace the NIC if it fails, because the MAC address will change. Cisco UCS makes this simple by creating a portable wrapper, called the Service Profile, that contains the MAC address and that the application (and OS) runs in. The Service Profile contains a lot more than the MAC address (over 100 items of server identity can be stored in it), and because it’s portable, I can move it from physical server 1 to physical server 2 and the application sees no changes. Think of it as virtual machine technology for the physical side of the datacenter. This capability is not unique to Cisco (HP, Dell, and IBM all have some aspect of it), but only UCS offers you complete control over every desired aspect of server identity.
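If it helps to see the idea in code, here is a toy Python model of a profile moving between blades. It is purely illustrative and has nothing to do with the actual UCS Manager API: the class names, field choices, and every identity value in it are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ServiceProfile:
    """Identity that travels with the workload (only a few of the many items)."""
    name: str
    uuid: str
    macs: Tuple[str, ...]
    wwpns: Tuple[str, ...]


@dataclass
class Blade:
    """A physical server; in this model it has no identity of its own."""
    model: str
    profile: Optional[ServiceProfile] = None


def move_profile(src: Blade, dst: Blade) -> None:
    """Disassociate the profile from src and associate it with dst."""
    if src.profile is None:
        raise ValueError("source blade has no profile to move")
    if dst.profile is not None:
        raise ValueError("destination blade already has a profile")
    dst.profile, src.profile = src.profile, None


# All identity values below are invented for illustration.
profile = ServiceProfile(
    name="example-profile",
    uuid="00000000-0000-0000-0000-000000000001",
    macs=("00:25:B5:00:00:01",),
    wwpns=("20:00:00:25:B5:00:00:01",),
)
m2 = Blade("B200 M2", profile)
m3 = Blade("B200 M3")
move_profile(m2, m3)
print(m3.profile == profile)  # the OS sees the same MACs, WWPNs, and UUID
```

The hardware changes underneath, but everything the OS or a license check would key on stays with the profile.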