Uploading / running utilities directly on ESXi hosts

As part of planning our VMware NSX-V upgrade from v6.2.2 to v6.2.4 we became aware of VMware KB2146171, which can cause VMs to lose network connectivity when vMotioned to other hosts following the upgrade. Obviously wishing to avoid this for our own (and customer) VMs, we raised a support case to obtain the VMware script that determines how many of our VMs (if any) were going to be affected. Unfortunately the script we were supplied was configured to run *after* the upgrade had already been completed. Fortunately the VMware utility it uses (vsipioctl – a binary to be run directly on ESXi hosts) could tell us which VMs were affected prior to upgrading.

Since we have a reasonably large number of hosts and hosted VMs I set about writing some PowerShell to perform the following actions:

  • Connect to vCenter and enumerate all ESXi hosts.
  • Enable SSH access to each host in turn.
  • Upload the VMware vsipioctl utility to the host /tmp/ folder and make it executable.
  • Run vsipioctl and parse the return information.
  • Build a table / CSV of all VM network interfaces with the results of the vsipioctl utility.
  • Disable SSH on the hosts once done and move on to the next host.

At first I tried using PuTTY's plink.exe and pscp.exe from PowerShell to perform the SSH and SCP file copies to the hosts, but had serious problems passing the password and command-line options correctly due to the way PowerShell escapes quoted strings. In the end I found it easier to use the Posh-SSH PowerShell module (https://github.com/darkoperator/Posh-SSH) for these functions rather than shelling out to the PuTTY executables.

Note that we usually leave SSH access disabled on our ESXi hosts, so the script shown enables this and then re-disables SSH after running – adjust if necessary when using in your own environments.
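
To give an idea of the approach, here's a cut-down sketch of the per-host loop using PowerCLI and Posh-SSH. This is a simplified illustration rather than the full script – the vCenter name, credentials, file paths and the exact vsipioctl arguments are placeholders, and older Posh-SSH releases use Set-SCPFile where newer ones use Set-SCPItem:

# Simplified sketch only - test against a single non-production host first.
# Assumes the PowerCLI and Posh-SSH modules are installed.
Import-Module VMware.VimAutomation.Core
Import-Module Posh-SSH

Connect-VIServer -Server 'vcenter.example.com'          # placeholder vCenter name
$rootCred = Get-Credential -Message 'ESXi root credentials'
$results  = @()

foreach ($vmhost in Get-VMHost) {
    # Enable SSH (the TSM-SSH service) on the host for the duration of the check
    $sshSvc = Get-VMHostService -VMHost $vmhost | Where-Object { $_.Key -eq 'TSM-SSH' }
    Start-VMHostService -HostService $sshSvc -Confirm:$false | Out-Null

    # Upload the vsipioctl binary to /tmp and make it executable
    Set-SCPFile -ComputerName $vmhost.Name -Credential $rootCred `
        -LocalFile 'C:\Temp\vsipioctl' -RemotePath '/tmp' -AcceptKey
    $session = New-SSHSession -ComputerName $vmhost.Name -Credential $rootCred -AcceptKey
    Invoke-SSHCommand -SSHSession $session -Command 'chmod +x /tmp/vsipioctl' | Out-Null

    # Run the utility and keep the raw output for parsing later
    # (use whatever arguments VMware support give you for the KB2146171 check)
    $output = Invoke-SSHCommand -SSHSession $session -Command '/tmp/vsipioctl getfilters'
    $results += [pscustomobject]@{ Host = $vmhost.Name; Output = ($output.Output -join "`n") }

    # Tidy up: close the SSH session and disable SSH on the host again
    Remove-SSHSession -SSHSession $session | Out-Null
    Stop-VMHostService -HostService $sshSvc -Confirm:$false | Out-Null
}

$results | Export-Csv -Path 'C:\Temp\vsipioctl-results.csv' -NoTypeInformation
Disconnect-VIServer -Server 'vcenter.example.com' -Confirm:$false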

If you need to run this check in your own environment you will still need to open a VMware support call to obtain the vsipioctl binary – as far as I am aware it isn't available any other way.

The script is shown below – hopefully this will be useful for some of you, just make sure you test properly before running it against a production environment. Luckily in our case the script proved that none of our VMs were impacted by this issue and we could safely proceed with our NSX-V upgrade.

Jon.

Installing Microsoft Azure Stack TP2 on VMware ESXi

This week Microsoft released Technical Preview 2 (TP2) of their 'Cloud in a box' Azure Stack product. This is scheduled for release in mid-2017 to allow enterprises and service providers to run Azure-consistent services from their own datacenters.

TP2 has a number of additional features over TP1 released earlier this year, but doesn’t support installation as a virtual machine. The hardware requirements are detailed here but basically you’re going to need a reasonably good spec server with enough local hard disks to be able to install it.

As I had good success running the previous TP1 release of Azure Stack in a virtual machine I thought I’d see if the same could be done with TP2. As with TP1, installation is only supported for a single machine node (clustered multi-node deployments are likely to come with TP3).

Of course installing TP2 as a VM is completely unsupported by both Microsoft and VMware so please don’t bug them with any issues – since TP2 is definitely not for production use this shouldn’t be a huge concern.

After several failed attempts I finally worked out a method to allow installation of TP2 as a virtual machine using VMware ESXi 6.0 Update 2 as the hypervisor platform. The key is in building the host virtual machine correctly and in modifying a couple of places in the installation PowerShell scripts to bypass the checks for physical hardware.

To start, create a new virtual hardware v11 Windows VM with appropriate sizing (I used a 200GB system disk, 128GB of RAM and 12 CPU cores configured as a single socket / 12 cores arrangement).

I then made the following changes (a rough PowerCLI sketch of the same settings follows the list):

  • Add a new SCSI host bus adapter and set the ‘bus sharing’ for this adapter to ‘Physical’ – this is required to allow the Storage Spaces Direct (S2D) configuration in the Azure Stack installer to correctly configure clustered storage.
  • Make sure that ‘Expose hardware assisted virtualization to the guest OS’ option is enabled to allow the VM to run the Hyper-V role and nested VMs.
  • Add 4 new virtual hard disks of at least 150GB each (I used 200GB for each disk again), configure them as 'Thick provision eager zeroed' and make sure they are attached to the new (physical bus sharing) SCSI adapter.
  • Use a single VMXNET3 network adapter connected to a network that has a DHCP server available on it.
  • Change portgroup security for the network to which the VM is attached to allow ‘Promiscuous Mode’, ‘MAC address changes’ and ‘Forged Transmits’.
  • Set the VM to boot into the BIOS setup screen on its next power-up and, when it boots, set the BIOS date/time to match your current timezone's date/time.
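
For anyone who prefers to script these settings rather than click through the web client, this rough PowerCLI sketch covers most of the list above. The VM and portgroup names are placeholders, and I still set the BIOS boot flag and date/time by hand:

# Sketch only - assumes PowerCLI is connected to vCenter and the VM exists but is powered off.
$vm = Get-VM -Name 'AzureStackTP2'                      # placeholder VM name

# Expose hardware-assisted virtualisation to the guest (needed for nested Hyper-V)
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.NestedHVEnabled = $true
$vm.ExtensionData.ReconfigVM($spec)

# Add four eager-zeroed 200GB disks and attach them to a new SCSI controller with physical bus sharing
$disks = 1..4 | ForEach-Object {
    New-HardDisk -VM $vm -CapacityGB 200 -StorageFormat EagerZeroedThick
}
New-ScsiController -HardDisk $disks -Type VirtualLsiLogicSAS -BusSharingMode Physical

# Relax security on the (standard vSwitch) portgroup the VM is attached to - 'VM Network' is a placeholder
Get-VirtualPortGroup -VMHost $vm.VMHost -Name 'VM Network' |
    Get-SecurityPolicy |
    Set-SecurityPolicy -AllowPromiscuous $true -MacChanges $true -ForgedTransmits $true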

Next, power on the VM and install a base operating system (I used Server 2012 R2, but the choice doesn't really matter as this environment is only used to bootstrap the installation process).

Once the server is running, download and unpack Azure Stack TP2 and move the extracted 'CloudBuilder.vhdx' file to the root of the C:\ drive. Follow the Microsoft instructions to download and run the 'PrepareBootFromVHD.ps1' script, which will reconfigure the VM to boot from the CloudBuilder.vhdx file and restart the VM.
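
For reference, this step boils down to a couple of commands from an elevated PowerShell prompt – the parameter names below are as I remember them from the Microsoft TP2 instructions, so check against the current documentation before running:

# Move the extracted VHDX to the root of C:\ (source path is a placeholder)
Move-Item -Path 'D:\AzureStackTP2\CloudBuilder.vhdx' -Destination 'C:\CloudBuilder.vhdx'

# Reconfigure the boot loader to boot from the VHDX and reboot - parameter names per the Microsoft TP2 docs
Set-Location 'C:\AzureStack_Installer'      # placeholder: wherever you saved PrepareBootFromVHD.ps1
.\PrepareBootFromVHD.ps1 -CloudBuilderDiskPath 'C:\CloudBuilder.vhdx' -ApplyUnattend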

Note: Depending on disk speed, it can take a considerable time to extract CloudBuilder.vhdx from the TP2 archive – you might want to keep a copy of it elsewhere on your network (or on the VM disk if you have space) in case you need to restart the installation from scratch.

Once the VM is up and running from the TP2 CloudBuilder vhdx image, make the following changes:

  • Install VMware Tools (required to add the VMXNET3 network driver) and restart when prompted – see the note below; the E1000 network adapter may be a better choice.
  • (Optional) Rename the computer and restart when prompted.
  • (Optional) Change the VM’s IP address to a static IPv4 address (rather than just using DHCP) so you can easily locate it on the network later – note that DHCP is still a requirement for the other VMs unless you use the Microsoft-documented installer switches that allow the use of static addresses.
  • Make sure that the date/time and timezone are set correctly and match the VM BIOS setting (I can’t stress this enough – I had at least 3 failed installation attempts due to date/time problems).
  • Make the following changes to the C:\CloudDeployment\Roles\PhysicalMachines\Tests\BareMetal.Tests.ps1 file:

Line 376:
Change:
$physicalMachine.IsVirtualMachine | Should Be $false
To:
$false | Should Be $false

Line 453:
Change:
($physicalMachine.Processors.NumberOfEnabledCores | Measure-Object -Sum).Sum | Should Not BeLessThan $minimumNumberOfCoresPerMachine
To:
12 | Should Not BeLessThan $minimumNumberOfCoresPerMachine

Then save the file. (The second change should not be necessary if you’ve built the VM with at least 12 cores, but the installation script appears to detect the number of physical cores as ‘0’ in a VM so this is required).
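
If you'd rather not hand-edit the file, the same two changes can be applied with a quick bit of PowerShell string replacement – the strings being replaced are exactly those quoted above, and a backup copy of the file is taken first:

# Patch BareMetal.Tests.ps1 to bypass the physical-hardware checks (backup copy taken first)
$file = 'C:\CloudDeployment\Roles\PhysicalMachines\Tests\BareMetal.Tests.ps1'
Copy-Item -Path $file -Destination "$file.bak"

(Get-Content -Path $file -Raw) `
    -replace [regex]::Escape('$physicalMachine.IsVirtualMachine | Should Be $false'), '$false | Should Be $false' `
    -replace [regex]::Escape('($physicalMachine.Processors.NumberOfEnabledCores | Measure-Object -Sum).Sum | Should Not BeLessThan $minimumNumberOfCoresPerMachine'), '12 | Should Not BeLessThan $minimumNumberOfCoresPerMachine' |
    Set-Content -Path $file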

You should now be able to run the CloudDeployment\Configuration\InstallAzureStackPOC.ps1 script and everything should work…

The installation process will take a considerable time, but hopefully if you’ve configured everything correctly you’ll have a working Azure Stack TP2 installation at the end with all of the required infrastructure servers running as Hyper-V guests within the VM.

NOTE: I hit an issue with installation failing at step 60.61.93 and thought this was related to installing in a VM, but it appears this is a more general issue with TP2 installation – see this MSDN thread for possible solutions if you encounter this error. If you encounter any issues with the installation I also recommend following the troubleshooting advice here.

Best of luck trying this out for yourselves!

Update 7th Oct 2016: If you’re having issues with guest (nested Hyper-V) VMs crashing, try using the E1000 network adapter for the host instead of VMXNET3. I’ve been doing some testing with this, and E1000 may be a better option and prevent this from occurring.
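
If you want to switch the existing adapter rather than recreate it, a PowerCLI one-liner along these lines should do it (shut the VM down first; 'AzureStackTP2' is a placeholder VM name):

# Change the host VM's network adapter from VMXNET3 to E1000 (run with the VM powered off)
Get-VM -Name 'AzureStackTP2' | Get-NetworkAdapter | Set-NetworkAdapter -Type e1000 -Confirm:$false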

Jon

Client Integration Plugin madness

One of the frustrations of dealing with the vSphere Web Client has always been the requirement for a browser plugin to import/export OVF templates. In vSphere 6 and vCloud Director 8 this has reached a whole new level of frustration. The issue is that both vSphere 6 and vCloud Director 8 offer to download and install a package called ‘VMware-ClientIntegrationPlugin-6.0.0.exe’. In an ideal world this plugin, once correctly installed, would work for the OVF import/export/upload/download functionality in both products… right?

Meanwhile in this world, although the package names are identical, the functionality is not – if you’ve installed the vSphere version then you can’t upload/download OVFs or ISOs in vCloud Director, and if you’ve got the vCD variant installed then vSphere OVF import/export doesn’t work. Uninstalling and reinstalling the ‘correct’ version fixes the problem (until you need the ‘other’ one again), but that can be easier said than done – particularly in shared desktop server administration environments, where other users with open browser sessions will prevent browser plugins from being reinstalled.

So how do you tell the two packages apart? In the current releases of vSphere 6.0 Update 1 and vCloud Director v8.0 the packages are significantly different in size:

                    vCloud Director 8                            vSphere 6.0 Update 1
Package Filename    VMware-ClientIntegrationPlugin-6.0.0.exe     VMware-ClientIntegrationPlugin-6.0.0.exe
File Version        11.0.0.2826                                  10.0.0.3637
Product Version     6.0.0.2826                                   6.0.0.3637
File Size           48.8 MB                                      94.9 MB

So the easiest way to tell them apart is by size: the smaller 48.8 MB file is the vCD plugin and the larger 94.9 MB one is the vSphere plugin.
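
If you'd rather not rely on file size, checking the installer's version information from PowerShell is quick (the path below is just an example download location):

# 11.0.0.x indicates the vCloud Director plugin; 10.0.0.x indicates the vSphere plugin
(Get-Item 'C:\Downloads\VMware-ClientIntegrationPlugin-6.0.0.exe').VersionInfo |
    Select-Object FileVersion, ProductVersion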

If (as I do) you often need to use both versions then maybe consider setting up separate management desktops (or virtual apps) for each so you can easily reach one that’s going to work for you.

Hopefully VMware will fix this in a future release and provide a single integration plugin that works across both products.

Update – 18th March 2016
VMware have just released vCloud Director for Service Providers v8.0.1 (http://pubs.vmware.com/Release_Notes/en/vcd/801/rel_notes_vcloud_director_801.html), which appears to have reverted the vCD Client Integration Plugin to version 5.6.0 – there is also mention in the release notes of the possible clashes between the vSphere and vCD client integration toolsets, so it appears that VMware are at least aware of the issue.