So you have a standalone instance of SCVMM and you want to make it highly available.
Firstly, well done for recognizing the importance of SCVMM and embarking on this journey.
There are many step-by-step guides out there, so I won’t reinvent the wheel here. What I will do, though, is give you some things to think about that have caught out some of my clients along the way…
Some prior reading material: Deploy VMM for HA on Docs
Just for completeness, the high level upgrade process from SA to HA is:
- Uninstall SCVMM opting to retain the database
- Create cluster and validate
- Install SCVMM on first cluster node, choosing to use HA
- Point to the existing VMM DB
- Login to SCVMM Console and validate it looks as it should
- Install SCVMM on second cluster node
I’m going to assume your SQL instance is HA. Or if not, you’re planning on moving that to HA at some stage. I will not cover that in this post.
As mentioned, the ‘how-to’ is not the intent of this post, but more the considerations that need to be made and the challenges some clients have faced when making the change.
The 5 key considerations I want to discuss are:
- Distributed Key Management
- Libraries and thus templates (this is the single most common issue that gets ignored when making SCVMM changes)
- Patch levels… Very important, example of a common challenge found here
- Security and Firewall configuration
- Run As Accounts
Based on your SCVMM maturity and implementation, some or many of these might not apply to you…
Distributed Key Management
First, what is Distributed Key Management (DKM) for? For encrypting stuff… Satisfied? No… well, you’ll probably want to read this doc about DKM and its purpose in VMM.
When most standalone deployments of VMM occur, the keys end up on the VMM server. This will work just fine for SA deployments, but in an HA config, each node must have access to the keys. So part of the HA deployment requirements is to store these keys in a container in AD.
The important part to consider here is the AD access to create and work within the container.
My recommendation is to always place this container at the root of the domain. This will go a long way to ensure no entrepreneurial AD/Security admins go breaking access.
I always use the default. Example: CN=VMMDKM,DC=corp,DC=contoso,DC=com
Easiest approach: get the container created and give the installer and VMM service accounts full control. Done.
Most AD admins are fine with this as the permissions are limited to the specific container. Occasionally I find an AD/Security admin that has different views, which is always a fun discussion.
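If you want to hand the AD team the exact distinguished name up front, here is a minimal Python sketch that builds it from the domain FQDN, assuming the container sits at the root of the domain and keeps the default name used above. The function name `dkm_container_dn` is my own for illustration and is not part of any VMM tooling.

```python
def dkm_container_dn(domain_fqdn: str, container_name: str = "VMMDKM") -> str:
    """Build the DN of a DKM container placed at the root of the domain.

    Assumption: the container lives directly under the domain root,
    as recommended in this post. The helper itself is hypothetical.
    """
    # Split the FQDN into its labels, ignoring stray dots and whitespace.
    labels = [label for label in domain_fqdn.strip().strip(".").split(".") if label]
    if not labels:
        raise ValueError("domain_fqdn must contain at least one label")
    # CN=<container> followed by one DC= component per domain label.
    return ",".join([f"CN={container_name}"] + [f"DC={label}" for label in labels])


print(dkm_container_dn("corp.contoso.com"))
# prints: CN=VMMDKM,DC=corp,DC=contoso,DC=com
```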
Libraries and Templates
It is commonly found that growing environments (typically those that make the change from SA to HA VMM) tend to have their SCVMM Library on the SCVMM server.
Whilst this is technically fine, there are some things to consider here.
Most HA deployments are typically a pair of new VMM servers configured in a cluster. And after the move to the HA deployment is complete, the library is still on the old SA SCVMM server.
Again this is fine, but most of the time the customer wants to decom the old SCVMM server. If any VM templates reference images from that library, we have to recreate those templates, as there is no supported way to change the source vhdx in a template.
If you need HA library shares, refer to the docs again on how to achieve this. Essentially, do not use a Scale-Out File Server (SOFS); use a general-use file server (GUFS) instead.
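To get a feel for how many templates would need recreating before the old server is decommissioned, here is a rough Python sketch. It assumes you have already exported a simple mapping of template name to source vhdx UNC path from VMM by whatever means you prefer. The template names, server names and paths below are made up for illustration.

```python
def unc_host(path: str) -> str:
    """Return the short, lowercase server name from a UNC path ('' if not UNC)."""
    if not path.startswith("\\\\"):
        return ""
    server = path[2:].split("\\", 1)[0]
    # Treat 'vmm01' and 'vmm01.corp.contoso.com' as the same machine.
    return server.split(".", 1)[0].lower()


def templates_using_library(templates: dict[str, str], old_server: str) -> list[str]:
    """List the templates whose source vhdx lives on the old library server."""
    old = old_server.split(".", 1)[0].lower()
    return sorted(name for name, vhdx in templates.items() if unc_host(vhdx) == old)


# Hypothetical inventory: template name -> source vhdx path.
templates = {
    "WS2016-Std": r"\\vmm01.corp.contoso.com\MSSCVMMLibrary\VHDs\ws2016.vhdx",
    "WS2019-Core": r"\\vmmlib01\Library\VHDs\ws2019-core.vhdx",
    "SQL2017": r"\\VMM01\MSSCVMMLibrary\VHDs\sql2017.vhdx",
}

print(templates_using_library(templates, "vmm01"))
# prints: ['SQL2017', 'WS2016-Std']
```

Anything in that list is a template you would need to rebuild against the new library before switching the old server off.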
Patch Levels
Historically the patch levels of SCVMM servers compared to the Hyper-V hosts or Infrastructure servers have not been a huge concern, but in the past few months this has become a more common point to consider.
Why now? SCVMM uses CredSSP to communicate with its agents, and the CredSSP updates for CVE-2018-0886 changed how CredSSP works. If the SCVMM server and its connected agents are on different sides of this patch, the host will appear as not responding… More info here
In short, if your hosts are all on a pre-March 2018 update level, then your new SCVMM server must be pre-March as well. Or vice versa…
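A quick way to sanity-check this before the migration is to sort your machines into "pre" and "post" buckets. The Python sketch below assumes you have collected the date of the last update installed on each machine, and uses that date as a rough proxy for whether the CVE-2018-0886 fix (first released on 13 March 2018) is present. That proxy is my assumption, so verify the actual installed updates before trusting the result. The machine names are made up.

```python
from datetime import date

# Release date of the first CredSSP update for CVE-2018-0886.
CREDSSP_FIX_RELEASED = date(2018, 3, 13)


def split_by_credssp_patch(last_update: dict[str, date]) -> tuple[list[str], list[str]]:
    """Split machines into (pre-patch, post-patch) by their last update date."""
    pre = sorted(name for name, d in last_update.items() if d < CREDSSP_FIX_RELEASED)
    post = sorted(name for name, d in last_update.items() if d >= CREDSSP_FIX_RELEASED)
    return pre, post


def same_side(last_update: dict[str, date]) -> bool:
    """True when every machine sits on the same side of the patch."""
    pre, post = split_by_credssp_patch(last_update)
    return not (pre and post)


# Hypothetical inventory: machine name -> date of last installed update.
inventory = {
    "vmm-node1": date(2018, 5, 8),
    "hyperv01": date(2018, 1, 9),
    "hyperv02": date(2018, 4, 10),
}

pre, post = split_by_credssp_patch(inventory)
print("pre-patch:", pre)
print("post-patch:", post)
print("consistent:", same_side(inventory))
```

With the sample data, hyperv01 lands on the pre-patch side while the rest are post-patch, which is exactly the mix that leads to hosts showing as not responding.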
Security and Firewall
This one is a very broad stroke but I will be very brief here.
In some highly ‘secure’ environments there are many firewalls between SCVMM and the hosts, and there have occasionally been delays in getting the new IPs (the cluster and the new VMM servers) added to the firewalls etc.
So make sure you check all of your security settings thoroughly: firewalls (both appliance and Windows), Group Policies, WinRM settings, Run As Accounts etc.
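For the firewall part, a basic TCP reachability test run from each new VMM node can save a lot of back and forth with the network team. Here is a minimal Python sketch. The port list is an assumption based on common defaults (5985/5986 for WinRM, 443 for file transfers), so check it against the ports your own deployment actually uses. A successful connect only proves the path is open, not that WinRM or the agent is healthy.

```python
import socket

# Assumed default ports; adjust to match your own VMM configuration.
ASSUMED_PORTS = {5985: "WinRM (HTTP)", 5986: "WinRM (HTTPS)", 443: "file transfer"}


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_hosts(hosts: list[str], ports: list[int], timeout: float = 3.0) -> dict[tuple[str, int], bool]:
    """Test every host/port pair and return the results keyed by (host, port)."""
    return {(h, p): tcp_port_open(h, p, timeout) for h in hosts for p in ports}
```

Calling `check_hosts(["hyperv01", "hyperv02"], list(ASSUMED_PORTS))` (host names made up) from each VMM node gives you a simple pass/fail table to send to whoever owns the firewalls.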
Run As Accounts
Keep in mind the Run As Accounts. I’ve seen it twice now, in two separate environments, where they had both the CredSSP issue and the Run As Accounts for managing the Hyper-V hosts got ‘locked’.
The root cause of the accounts being locked was never actually found (or really looked into), but in both cases the deployment was faced with the CredSSP issue. Gut feel tells me there is a link, but it is only speculation.
Also make sure the Run As Accounts have the required permissions on the VMM servers. My baseline is to create an AD group (e.g. SCVMM Local Admins) with the appropriate members and add this as local admin to the SCVMM servers.
As with all System Center products, proper planning and understanding are paramount to the success of these solutions. If you fail to plan, you plan to fail.
Happy SCVMM’ing 😀
Dan