Before you can deploy a vSphere Lifecycle Manager (vLCM) image-based cluster in VMware Cloud Foundation, you must first import an image into the Image Management inventory in SDDC Manager. You can do this via the SDDC Manager UI using a pre-existing cluster, or you can now use PowerVCF to import the image thanks to the addition of New-VCFPersonality (vLCM images are known as personalities in VCF, hence the name of the cmdlet).
The sequence of events to be able to import an image is as follows:
1. Extract a vLCM image from a host that you wish to use in the workload domain. The host doesn't need to be in the vCenter Server or SDDC Manager inventory.
2. Create a temporary cluster in vCenter (it must be created in a VCF workload domain) and assign the image from the previous step.
3. Import the image from the source cluster into SDDC Manager.
To achieve these steps we can use PowerCLI (and PowerVCF for the final import):
# Variables
$sourceHostUrl = "https://sfo01-w01-esx01.sfo.rainpole.io"
$sourceHostBuild = "21495797"
$sourceHostRootPassword = "VMw@re1!"
$vcenterFQDN = "sfo-m01-vc01.sfo.rainpole.io"
$ssoUsername = "administrator@vsphere.local"
$ssoPassword = "VMw@re1!"
$vcenterDC = "sfo-m01-dc01"
$sddcManagerFQDN = "sfo-vcf01.sfo.rainpole.io"
# Retrieve the source host thumbprint
$request = [System.Net.WebRequest]::Create($sourceHostUrl)
$request.GetResponse() | Out-Null
$cert = $request.ServicePoint.Certificate
$sourceHostThumbprint = $cert.GetCertHashString() -replace '(..(?!$))','$1:'
# Connect to vCenter and import the image from the source host to the depot
Connect-VIServer -Server $vcenterFQDN -User $ssoUsername -Password $ssoPassword
$OfflineHostCredentials = Initialize-SettingsDepotsOfflineHostCredentials -HostName $sourceHostUrl -UserName "root" -Password $sourceHostRootPassword -Port 443 -SslThumbPrint $sourceHostThumbprint
$OfflineConnectionSpec = Initialize-SettingsDepotsOfflineConnectionSpec -AuthType "USERNAME_PASSWORD" -HostCredential $OfflineHostCredentials
Invoke-CreateFromHostDepotsOfflineAsync -SettingsDepotsOfflineConnectionSpec $OfflineConnectionSpec
# Create a temporary cluster and assign the image
$LcmImage = Get-LcmImage -Type BaseImage | Where-Object {$_.Version -match $sourceHostBuild}
$clusterID = (New-Cluster -Location $vcenterDC -Name 'vLCM-Cluster' -HAEnabled -DrsEnabled -BaseImage $LcmImage).ExtensionData.MoRef.Value
# Import the image to SDDC Manager
Request-VCFToken -fqdn $sddcManagerFQDN -username $ssoUsername -password $ssoPassword
$vCenterID = (Get-VCFvCenter | Where-Object {$_.fqdn -match $vcenterFQDN}).id
New-VCFPersonality -name $sourceHostBuild -vCenterId $vCenterID -clusterId $clusterID
That should import the new image into the SDDC Manager image repository for use when creating a vLCM image-based workload domain.
Since the introduction of subscription-based licensing for VMware Cloud Foundation (VCF+), there are now two licensing modes in VCF (perpetual or subscription). To make it easier to identify the subscription status of the system and each workload domain, we have added support for Get-VCFLicenseMode in the latest release of PowerVCF, 2.3.0.1004.
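As a quick example (the FQDN and credentials below are placeholders), request an API token and then query the license mode:
# Connect to SDDC Manager and request an API token
Request-VCFToken -fqdn sfo-vcf01.sfo.rainpole.io -username administrator@vsphere.local -password VMw@re1!
# Return the licensing mode of the system and each workload domain
Get-VCFLicenseMode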
Virtual NVMe isn't a new concept; it's been around since the vSphere 6.x days. As part of some lab work I needed to automate adding an NVMe controller and some devices to a VM. This can be accomplished using the PowerCLI cmdlets for the vSphere API.
# NVME Using PowerCLI
$vcenterFQDN = "sfo-m01-vc01.sfo.rainpole.io"
$vcenterUsername = "administrator@vsphere.local"
$vcenterPassword = "VMw@re1!"
$vms = @("sfo01-w01-esx01","sfo01-w01-esx02","sfo01-w01-esx03","sfo01-w01-esx04")
# Install the required module
Install-Module VMware.Sdk.vSphere.vCenter.Vm
# Connect to vCenter
Connect-VIServer -Server $vcenterFQDN -User $vcenterUsername -Password $vcenterPassword
# Add an NVMe controller to each VM
foreach ($vmName in $vms) {
    $VmHardwareAdapterNvmeCreateSpec = Initialize-VmHardwareAdapterNvmeCreateSpec -Bus 0 -PciSlotNumber 0
    Invoke-CreateVmHardwareAdapterNvme -Vm (Get-VM $vmName).ExtensionData.MoRef.Value -VmHardwareAdapterNvmeCreateSpec $VmHardwareAdapterNvmeCreateSpec
}
# Add a 256 GB NVMe disk to each VM (capacity is specified in bytes)
foreach ($vmName in $vms) {
    $VmHardwareDiskVmdkCreateSpec = Initialize-VmHardwareDiskVmdkCreateSpec -Capacity 274877906944
    $VmHardwareDiskCreateSpec = Initialize-VmHardwareDiskCreateSpec -Type "NVME" -NewVmdk $VmHardwareDiskVmdkCreateSpec
    Invoke-CreateVmHardwareDisk -Vm (Get-VM $vmName).ExtensionData.MoRef.Value -VmHardwareDiskCreateSpec $VmHardwareDiskCreateSpec
}
Once I got my head around the basics of Terraform, I wanted to play with the vSphere provider to see what it was capable of. A basic use case that everyone needs is deploying a VM, so my first use case is to deploy a VM from an OVA. The vSphere provider documentation for deploying an OVA uses William Lam's nested ESXi OVA as an example. This is a great example of how to use the provider, but seeing as I plan to play with the NSX-T provider as well, I decided to use the NSX-T Manager OVA as my source to deploy.
So the first thing to do is set up your provider. Every provider in the Terraform registry has a Use Provider button on the provider page that pops up a "How to use this provider" box. This shows you what you need to put in your required_providers and provider blocks. In my case I will use a providers.tf file, which will look like the example below. Note that you can only have one required_providers block in your configuration, but you can have multiple providers, so all required providers go in the same required_providers block and each provider gets its own provider block.
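Here is a minimal sketch of that providers.tf (the version constraint is an example, and the variable names are my own choices; they just need to match what you declare in variables.tf):
# providers.tf
terraform {
  required_providers {
    vsphere = {
      source  = "hashicorp/vsphere"
      version = ">= 2.0.0"
    }
  }
}

# Provider block with credentials supplied via variables
provider "vsphere" {
  user                 = var.vsphere_user
  password             = var.vsphere_password
  vsphere_server       = var.vsphere_server
  allow_unverified_ssl = true
}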
To authenticate to our chosen provider (in this case vSphere) we need to provide credentials. If you read my initial post on Terraform you will have seen me mention a terraform.tfvars file, which can be used for sensitive variables. We will declare these as variables later in the variables.tf file, but this is where we assign the values. So my terraform.tfvars file looks like this:
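Something like the following, with example values from my lab:
# terraform.tfvars
vsphere_user     = "administrator@vsphere.local"
vsphere_password = "VMw@re1!"
vsphere_server   = "sfo-m01-vc01.sfo.rainpole.io"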
Next we need variables to enable us to deploy our NSX-T Manager appliance, so we create a variables.tf file and populate it with our variables. Note: variables that have a default value are considered optional, and the default value will be used if no value is passed.
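Here is an abridged sketch covering a sample of the variables referenced in main.tf below (the full file also declares the remaining variables, including the nsx_* vApp properties):
# variables.tf (abridged)
variable "vsphere_user" {}
variable "vsphere_password" {}
variable "vsphere_server" {}

variable "data_center" {
  description = "Target vCenter Server datacenter"
}

variable "vm_name" {
  description = "Name of the NSX-T Manager VM"
}

variable "local_ovf_path" {
  description = "Full path to the NSX-T Manager OVA on the local filesystem"
}

variable "deployment_option" {
  description = "NSX-T Manager appliance size"
  default     = "medium" # A default makes the variable optional
}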
Now that we have our provider & variables in place we need a plan file to deploy the NSX-T Manager OVA, including the data sources we need to pull information from and the resource we are going to create.
# main.tf

# Data source for vCenter Datacenter
data "vsphere_datacenter" "datacenter" {
  name = var.data_center
}

# Data source for vCenter Cluster
data "vsphere_compute_cluster" "cluster" {
  name          = var.cluster
  datacenter_id = data.vsphere_datacenter.datacenter.id
}

# Data source for vCenter Datastore
data "vsphere_datastore" "datastore" {
  name          = var.workload_datastore
  datacenter_id = data.vsphere_datacenter.datacenter.id
}

# Data source for vCenter Portgroup
data "vsphere_network" "mgmt" {
  name          = var.mgmt_pg
  datacenter_id = data.vsphere_datacenter.datacenter.id
}

# Data source for vCenter Resource Pool. In our case we will use the root resource pool
data "vsphere_resource_pool" "pool" {
  name          = format("%s%s", data.vsphere_compute_cluster.cluster.name, "/Resources")
  datacenter_id = data.vsphere_datacenter.datacenter.id
}

# Data source for ESXi host to deploy to
data "vsphere_host" "host" {
  name          = var.compute_host
  datacenter_id = data.vsphere_datacenter.datacenter.id
}

# Data source for the OVF to read the required OVF properties
data "vsphere_ovf_vm_template" "ovfLocal" {
  name             = var.vm_name
  resource_pool_id = data.vsphere_resource_pool.pool.id
  datastore_id     = data.vsphere_datastore.datastore.id
  host_system_id   = data.vsphere_host.host.id
  local_ovf_path   = var.local_ovf_path
  ovf_network_map = {
    "Network 1" = data.vsphere_network.mgmt.id
  }
}

# Deployment of VM from Local OVA
resource "vsphere_virtual_machine" "nsxt01" {
  name                 = var.vm_name
  datacenter_id        = data.vsphere_datacenter.datacenter.id
  datastore_id         = data.vsphere_ovf_vm_template.ovfLocal.datastore_id
  host_system_id       = data.vsphere_ovf_vm_template.ovfLocal.host_system_id
  resource_pool_id     = data.vsphere_ovf_vm_template.ovfLocal.resource_pool_id
  num_cpus             = data.vsphere_ovf_vm_template.ovfLocal.num_cpus
  num_cores_per_socket = data.vsphere_ovf_vm_template.ovfLocal.num_cores_per_socket
  memory               = data.vsphere_ovf_vm_template.ovfLocal.memory
  guest_id             = data.vsphere_ovf_vm_template.ovfLocal.guest_id
  scsi_type            = data.vsphere_ovf_vm_template.ovfLocal.scsi_type

  dynamic "network_interface" {
    for_each = data.vsphere_ovf_vm_template.ovfLocal.ovf_network_map
    content {
      network_id = network_interface.value
    }
  }

  wait_for_guest_net_timeout = 5

  ovf_deploy {
    allow_unverified_ssl_cert = true
    local_ovf_path            = var.local_ovf_path
    disk_provisioning         = "thin"
    deployment_option         = var.deployment_option
  }

  vapp {
    properties = {
      "nsx_role"               = var.nsx_role,
      "nsx_ip_0"               = var.nsx_ip_0,
      "nsx_netmask_0"          = var.nsx_netmask_0,
      "nsx_gateway_0"          = var.nsx_gateway_0,
      "nsx_dns1_0"             = var.nsx_dns1_0,
      "nsx_domain_0"           = var.nsx_domain_0,
      "nsx_ntp_0"              = var.nsx_ntp_0,
      "nsx_isSSHEnabled"       = var.nsx_isSSHEnabled,
      "nsx_allowSSHRootLogin"  = var.nsx_allowSSHRootLogin,
      "nsx_passwd_0"           = var.nsx_passwd_0,
      "nsx_cli_passwd_0"       = var.nsx_cli_passwd_0,
      "nsx_cli_audit_passwd_0" = var.nsx_cli_audit_passwd_0,
      "nsx_hostname"           = var.nsx_hostname
    }
  }

  lifecycle {
    ignore_changes = [
      #vapp # Enable this to ignore all vapp properties if the plan is re-run
      vapp[0].properties["nsx_role"], # Avoid unwanted changes to specific vApp properties.
      vapp[0].properties["nsx_passwd_0"],
      vapp[0].properties["nsx_cli_passwd_0"],
      vapp[0].properties["nsx_cli_audit_passwd_0"],
      host_system_id # Avoids moving the VM back to the host it was deployed to if DRS has relocated it
    ]
  }
}
Once we have all of the above in place, we can run the following to validate our plan:
terraform plan -out=nsxt01
If your plan is successful, you should see output similar to the below.
Once your plan is successful, run the command below to apply the plan:
terraform apply nsxt01
If the stars align, your NSX-T Manager appliance should deploy successfully. Once it's deployed, if you were to re-run the plan you should see a message similar to the below.
One of the key pieces to this is the lifecycle block in the plan. The lifecycle block enables you to call out things that Terraform should ignore when it is re-applying a plan: things like tags or other items that may get updated by other systems. In our case we want Terraform to ignore the vApp properties, as it would otherwise try to apply the password properties every time, which would entail powering down the VM, making the change, and powering the VM back on.
Along with the release of VMware Cloud Foundation 4.3.1, we are excited to announce the general availability of the Site Protection & Disaster Recovery for VMware Cloud Foundation Validated Solution. The solution documentation, intro and other associated collateral can be found on the Cloud Platform Tech Zone here.
The move from VMware Validated Designs to VMware Validated Solutions has been covered in detail by my teammate Gary Blake here, so I won't repeat it. Instead I will concentrate on what Ken Gould and I (along with a supporting team) have been working to deliver for the past few months.
The Site Protection & Disaster Recovery for VMware Cloud Foundation Validated Solution delivers an end-to-end validated way to protect your mission-critical applications. You get a set of documentation tailored to the solution that includes:
Design objectives
A detailed design, including not just the design decisions but the justifications & implications of those decisions
Detailed implementation steps, with PowerShell alternatives for some steps to speed up time to deploy
Operational guidance on how to use the solution once it is deployed
Solution interoperability between it and other Validated Solutions
An appendix containing all the solution design decisions in one easy place for review
And finally, a set of frequently asked questions that will be updated for each release
Disaster recovery is a huge topic for everyone lately. Everything from power outages to natural disasters to ransomware and beyond can be classed as a disaster, and regardless of the type of disaster you must be prepared. To adequately plan for business continuity in the event of a disaster, you must protect your mission-critical applications so that they can be recovered. In a VMware Cloud Foundation environment, cloud operations and automation services are delivered by vRealize Suite Lifecycle Manager, vRealize Operations Manager & vRealize Automation, with authentication services delivered by Workspace ONE Access.
To provide DR for our mission-critical apps we leverage two VCF instances with NSX-T Federation between them. The primary VCF instance runs the active NSX-T Global Manager and the recovery VCF instance runs the standby NSX-T Global Manager. All load balancing services are served from the protected instance, with a standby load balancer (disconnected from the recovery site NSX Tier-1 until required, to avoid IP conflicts) in the recovery instance. Using our included PowerShell cmdlets you can quickly create and configure the standby load balancer to mimic your active load balancer, saving you a ton of manual UI clicks.
In the (hopefully never) event of needing to fail over the cloud management applications, you can easily bring the standby load balancer online to enable networking services for the failed-over applications.
Using Site Recovery Manager (SRM) you can run planned migrations or disaster recovery migrations. With a single set of SRM recovery plans, regardless of the scenario, you will be guided through the recovery process. In this post I will cover what happens in the event of a disaster.
When a disaster occurs on the protected site (once the panic subsides) there are a series of tasks you need to perform to bring those mission critical apps back online.
First? Fix the network! Log into the passive NSX Global Manager (GM) on the recovery site and promote the GM to active. (Note: this can take about 10-15 minutes.)
To cover the case of an accidental "Force Active" click, we've built in the "Are you absolutely sure this is what you want to do?" prompt!
Once the promotion operation completes, our standby NSX GM is now active and can be used to manage the surviving site's NSX Local Manager (LM).
Once the recovery site GM is active, we need to ensure that the cross-instance NSX Tier-1 is now directing egress traffic via the recovery site. To do this we must update the locations on the Tier-1: navigate to GM > Tier-1 Gateways > Cross Instance Tier-1 and, under Locations, make the recovery location Primary.
The next step is to ensure we have an active load balancer running in the recovery site so that our protected applications come up correctly. To do this, log into what is now our active GM, select the recovery site NSX Local Manager (LM), and navigate to Networking > Load Balancing. Edit the load balancer and attach it to the recovery site standalone Tier-1.
At this point we are ready to run our SRM recovery plans. The recommended order for running the recovery plans (assuming you have all of the protected components listed below) is as follows. This ensures lifecycle & authentication services (vRSLCM & WSA) are up before the applications that depend on them (vROPS & vRA):
1. vRSLCM – WSA RP
2. Intelligent Operations Management RP
3. Private Cloud Automation RP
I'm not going to go through each recovery plan in detail here; they are documented in the Site Protection and Disaster Recovery Validated Solution. In some, you will be prompted to verify this or that along the way to ensure a successful failover.
The main thing in a DR situation is: DO NOT PANIC. And what is the best way to get to a place where you DO NOT PANIC? Test your DR plans… so when you see this…
Your reaction is this…
Trust the plan…test the plan…relax…you have a plan!
Hopefully this post was useful. If you want to learn more, please reach out in the comments, and if you're attending VMworld and would like to learn more or ask some questions, please drop into our Meet The Experts session on Thursday.
In Part 1 of this series we saw how to retrieve a sessionId from the Site Recovery Manager VAMI interface using Postman & PowerShell. In this post we will use that sessionId to replace the appliance SSL certificate using the API. To start, we again use the VAMI UI to inspect the endpoint URL used for certificate replacement by performing a manual replacement. In this case the URL is:
Site Recovery Manager expects the certificate in P12 format, so I used CertGen to create the cert format needed. When using the UI you browse to the cert file and it uploads in the browser along with the certificate passphrase. Behind the scenes it is then base64 encoded, so you need to do this yourself before using the API.
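In PowerShell the encoding can be done like this (the file path is an example):
# Read the P12 file and base64 encode its contents for the API payload
$certFilePath = "C:\certs\srm01.p12"
$base64Cert = [System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes($certFilePath))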
Within a VMware Cloud Foundation instance, SDDC Manager is used to manage the lifecycle of passwords (or credentials). While we provide the ability to rotate them (either scheduled or manually), there is currently no easy way to check when a particular password is due to expire, which can lead to appliance root passwords expiring and causing all sorts of issues. The ability to monitor expiry is being worked on, but as a stopgap I put together the script below, which leverages PowerVCF and a currently undocumented API for validating credentials.
The script has a function called Get-VCFPasswordExpiry that accepts the following parameters:
-fqdn (FQDN of the SDDC Manager)
-username (SDDC Manager Username – Must have the ADMIN role)
-password (SDDC Manager password)
-resourceType (Optional parameter to specify a resourceType. If not passed, all resources will be checked. If passed (e.g. VCENTER), only that resourceType will be checked.) Supported resource types are:
PowerVCF is a requirement. If you don't already have it, run the following:
Install-Module -Name PowerVCF
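With the script loaded, an example invocation looks like this (the values are placeholders):
# Check password expiry for all vCenter Server credentials
Get-VCFPasswordExpiry -fqdn sfo-vcf01.sfo.rainpole.io -username administrator@vsphere.local -password VMw@re1! -resourceType VCENTER
# Or omit -resourceType to check all resources
Get-VCFPasswordExpiry -fqdn sfo-vcf01.sfo.rainpole.io -username administrator@vsphere.local -password VMw@re1!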
The code takes a while to run, as it needs to do the following to check password expiry:
Connect to SDDC Manager to retrieve an API token
Retrieve a list of all credentials
Then, using the resourceID of each credential:
Perform a credential validation
Wait for the validation to complete
Parse the results for the expiry details
Finally, add all the results to an array and present them in a table (kudos to Ken Gould for assistance with the presentation of this piece!)
In this example script I am returning all non-SERVICE user accounts regardless of expiry (SERVICE account passwords are system managed). You could get more granular by adding something like this to only display accounts with passwords due to expire in less than 14 days:
if ($validationTaskResponse.validationChecks.passwordDetails.numberOfDaysToExpiry -lt 14) {
Write-Output "Password for username $($validationTaskResponse.validationChecks.username) expires in $($validationTaskResponse.validationChecks.passwordDetails.numberOfDaysToExpiry) days"
}
Traditionally, VMware Cloud Foundation (VCF) has followed the hybrid approach to SSL certificate management. Hybrid mode essentially means using CA-signed certs for the vCenter Server machine SSL cert and VMCA-signed certs for the solution user certs. In this mode, ESXi host certs are VMCA-managed also. You then have the option to integrate with an external Microsoft CA or continue to use VMCA for all certs. If you decide to integrate with a Microsoft CA, ESXi host certs remain VMCA-managed. This is not always ideal, as some customers require all components on the network to be signed by a known & trusted CA. Up until the recent 4.1 release of VMware Cloud Foundation it was not possible to use custom CA-signed certs on your ESXi hosts, as hybrid mode would overwrite your CA-signed ESXi certs with VMCA-signed certs. There is a great blog post here on how to manually enable CA-signed certs, but with VCF 4.1 it is now supported to do this via the API during bringup. The procedure is as follows:
Install the ESXi hosts that will be used for bringup with the ESXi version on the Bill Of Materials for 4.1
Install your custom CA signed certs on each host that will be used for the management domain
Log in to the ESXi Shell, either directly from the DCUI or from an SSH client, as a user with administrator privileges.
In the directory /etc/vmware/ssl, rename the existing certificates using the following commands.
mv rui.crt orig.rui.crt
mv rui.key orig.rui.key
Copy the certificates that you want to use to /etc/vmware/ssl.
Rename the new certificate and key to rui.crt and rui.key.
Restart the host management agents by running the following commands:
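/etc/init.d/hostd restart
/etc/init.d/vpxa restart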
Repeat the above steps for all management domain hosts
To ensure that SDDC Manager is aware that you are using custom certs, you need to add a flag in the bringup JSON, along with the PEM-encoded signing chain certificate, so that it is added to the SDDC Manager keystore. This ensures the certificates are trusted. The API guide for 4.1 provides an example JSON spec here. Pay particular attention to this section:
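From memory it looks something like the snippet below; treat the exact field names as an assumption and verify them against the 4.1 API guide:
"security": {
    "esxiCertsMode": "Custom",
    "rootCaCerts": [
        {
            "alias": "rootca",
            "certChain": [
                "-----BEGIN CERTIFICATE-----<PEM encoded signing chain certificate>-----END CERTIFICATE-----"
            ]
        }
    ]
}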
You can then follow the steps outlined in the API guide to deploy the management domain using the Cloud Builder API. Note that once custom mode is enabled, all future workload domains that you create must also use signed certs.
VMware Skyline™ is a proactive support service aligned with VMware Global Support Services. VMware Skyline automatically and securely collects, aggregates, and analyzes product usage data, proactively identifying potential problems and helping VMware Technical Support Engineers improve resolution time.
With the release of VMware Skyline Collector 2.6, it is now supported to add VMware Cloud Foundation awareness to Skyline Advisor, ensuring that Skyline findings and recommendations take into account, and do not violate, the VMware Cloud Foundation design or Bill of Materials (BOM). To enable this integration you add the SDDC Manager from each Cloud Foundation instance. When a vCenter Server, NSX-T Manager, or vRealize Operations Manager from that VMware Cloud Foundation instance is added, it is automatically associated with the SDDC Manager and tagged for VCF-based recommendations in Skyline Advisor.
To add Cloud Foundation to Skyline you need to do the following:
Add a user in SDDC Manager and assign the VMware Cloud Foundation Viewer role.
Configure the VMware Skyline Collector to add the SDDC Manager instance by entering the FQDN & credentials for the above user
Once added, SDDC Manager will show in the collector inventory view.
Logging into Skyline Advisor in VMware Cloud Services, you will now see VMware Cloud Foundation listed as part of the inventory on the dashboard.
Navigating to the Inventory tab enables you to expand the VMware Cloud Foundation view to see the associated Cloud Foundation inventory:
VMware Cloud Foundation
SDDC Manager
Management Domain
Workload Domain(s)
vRealize Operations Manager
Skyline findings and recommendations for these associated inventory items will now be surfaced as solution-based Proactive Findings.