vCloud Director 9.7 Portal Customization

One of the nicest additions to the new VMware vCloud Director 9.7 release is the ability to more fully customize the tenant portal. This now includes the capability to define custom links (together with section groupings / separators) and also the capability to customize the portal (and links) on a per-tenant basis:


To help take advantage of this, I’ve updated my vcd-h5-themes module on Github to understand the new capabilities in vCloud Director v9.7 (API version 32.0) to allow easier manipulation of the portal branding configuration options.

In particular the ‘Set-Branding’ cmdlet can now take a PSObject parameter with the customization links to be overridden or added to the portal (more about what this means later), it will also now take an optional parameter to limit the scope to a single vCD tenant organization (rather than applying changes to the system-default branding).

The ‘Get-Branding’ cmdlet has also been updated and can now retrieve either the global default branding, or the branding from a specific tenant organization.

There are actually 2 types of links that can be specified:

1) The ‘Help’ and the ‘About’ links under the circled ? icon can be redirected to other sites/pages (rather than showing the default VMware pages)

2) The menu under the current username (highlighted in red above) can be extended with any number of new sections, separators and links to other pages.

The way these are performed is slightly different, but both are placed into the customLinks object passed to Set-Branding.

Worked example

Let’s say that we want to make the following changes to the portal links:

– The ‘About’ link under the ? icon should redirect to our company about page at instead of the default VMware ‘About’ page.

– The Extensible menu under the username drop-down should have the following structure:

+– Help Desk (redirecting to
+– Contact Us (redirecting to mailto with a subject line of ‘Web Support’)
—– (Separator)
+– Other services (redirecting to
—– (Separator)
Terms & Conditions (redirecting to

To create these changes, we need to build a customLink object in PowerShell that reflects this arrangement, the code to do this is shown below. Running this code will create a PowerShell object variable ‘$mylinks’ which can then be passed to the Set-Branding cmdlet:

### Create $mylinks variable with our branding menu structure
$mylinks = [PSCustomObject]@(
    # Override the default 'about' link to redirect to
    # Add the section name 'Support':
    # Add the 'Help Desk' link:
        name="Help Desk";
    # Add the 'Contact Us' link:
        name="Contact Us";
        url=" Support"
    # Add the Separator:
    # Add the 'Services' group:
    # Add the 'Other services' link:
        name="Other services";
    # Add the 2nd Separator:
    # Add the 'Terms & Conditions' link:
        name="Terms & Conditions";
### End of File ###

The syntax is a bit fiddly here – in particular make sure that you place quote marks around each value as shown above – it may be easier to copy this script and edit the values rather than creating from scratch.

To test the object has been created successfully prior to configuring the portal, you can do ‘ConvertTo-Json $mylinks’ which should show a well-formatted JSON object if everything is correct:

     "menuItemType": "override",
     "url": "",
     "name": "about"
     "menuItemType": "section",
     "name": "Support"
     "menuItemType": "link",
     "url": "",
     "name": "Help Desk"
     "menuItemType": "link",
     "url": " Support",
     "name": "Contact Us"
     "menuItemType": "separator"
     "menuItemType": "section",
     "name": "Services"
     "menuItemType": "link",
     "url": "",
     "name": "Other services"
     "menuItemType": "separator"
     "menuItemType": "link",
     "url": "",
     "name": "Terms & Conditions"

To set our branding (make sure you use Connect-CIServer to connect to the appropriate cloud first in the ‘System’ context) then:

Set-Branding -customLinks $mylinks
Branding configuration sent successfully.

You can also use the ‘-Tenant’ switch to apply the changes to a specific tenant organization only.

When we look in the vCD HTML5 portal clicking on our username in the top-right of the portal we can now see our new link structure in place:


In addition, the ‘About’ option under the menu obtained by clicking the circled ? will now redirect to our own site:


Dynamic Persistent Volumes with CSE Kubernetes and Ceph


Application containerization with Docker is fast becoming the default deployment pattern for many business applications and Kubernetes (k8s) the method of managing these workloads. While containers generally should be stateless and ephemeral (able to be deployed, scaled and deleted at will) almost all business applications require data persistence of some form. In some cases it is appropriate to offload this to an external system (a database, file store or object store in public cloud environments are common for example).

This doesn’t cover all storage requirements though, and if you are running k8s in your own environment or in a hosted service provider environment you may not have access to compatible or appropriate storage. One solution for this is to build a storage platform alongside a Kubernetes cluster which can provide storage persistence while operating in a similar deployment pattern to the k8s cluster itself (scalable, clustered, highly available and no single points of failure).

VMware Container Service Extension (CSE) for vCloud Director (vCD) is an automated way for customers of vCloud powered service providers to easily deploy, scale and manage k8s clusters, however CSE currently only provides a limited storage option (an NFS storage server added to the cluster) and k8s persistent volumes (PVs) have to be pre-provisioned in NFS and assigned to containers/pods rather than being generated on-demand. This can also cause availability, scale and performance issues caused by the pod storage being located on a single server VM.

There is certainly no ‘right’ answer to the question of persistent storage for k8s clusters – often the choice will be driven by what is available in the platform you are deploying to and the security, availability and performance requirements for this storage.

In this post I will detail a deployment using a ceph storage cluster to provide a highly available and scalable storage platform and the configuration required to enable a CSE deployed k8s cluster to use dynamic persistent volumes (DPVs) in this environment.

Due to the large number of servers/VMs involved, and the possibility of confusion / working on the wrong server console – I’ve added buttons like this prior to each section to show which system(s) the commands should be used on.


I am not an expert in Kubernetes or ceph and have figured out most of the contents in this post from documentation, forums, google and (sometimes) trial and error. Refer to the documentation and support resources at the links at the end of this post if you need the ‘proper’ documentation on these components. Please do not use anything shown in this post in a production environment without appropriate due diligence and making sure you understand what you are doing (and why!).

Solution Overview

Our solution is going to be based on a minimal viable installation of ceph with a CSE cluster consisting of 4 ceph nodes (1 admin and 3 combined OSD/mon/mgr nodes) and a 4 node Kubernetes cluster (1 master and 3 worker nodes). There is no requirement for the OS in the ceph cluster and the kubernetes cluster to be the same, however it does make it easier if the packages used for ceph are at the same version which is easier to achieve using the same base OS for both clusters. Since CSE currently only has templates for Ubuntu 16.04 and PhotonOS, and due to the lack of packages for the ‘mimic’ release of ceph on PhotonOS, this example will use Ubuntu 16.04 LTS as the base OS for all servers.

The diagram below shows the components required to be deployed – in the lab environment I’m using the DNS and NTP servers already exist:

solution overview

Note: In production ceph clusters, the monitor (mon) service should run on separate machines from the nodes providing storage (OSD nodes), but for a test/dev environment there is no issue running both services on the same nodes.


You should ensure that you have the following enabled and configured in your environment before proceeding:

Configuration ItemRequirement
DNSHave a DNS server available and add host (‘A’) records for each of the ceph servers. Alternatively it should be possible to add /etc/hosts records on each node to avoid the need to configure DNS. Note that this is only required for the ceph nodes to talk to each other, the kubernetes cluster uses direct IP addresses to contact the ceph cluster.
NTPHave an available NTP time source on your network, or access to external ntp servers
Static IP PoolContainer Service Extension (CSE) requires a vCloud OrgVDC network with sufficient addresses available from a static IP pool for the number of kubernetes nodes being deployed
SSH Key PairGenerated SSH key pair to be used to administer the deployed CSE servers. This could (optionally) also be used to administer the ceph servers
VDC CapacityEnsure you have sufficient resources (Memory, CPU, Storage and number of VMs) in your vCD VDC to support the desired cluster sizes

Ceph Storage Cluster

The process below describes installing and configuring a ceph cluster on virtualised hardware. If you have an existing ceph cluster available or are building on physical hardware it’s best to follow the ceph official documentation at this link for your circumstances.

Ceph Server Builds

The 4 ceph servers can be built using any available hardware or virtualisation platform, in this exercise I’ve built them from an Ubuntu 16.04 LTS server template with 2 vCPUs and 4GB RAM for each in the same vCloud Director environment which will be used for deployment of the CSE kubernetes cluster. There are no special requirements for installing/configuring the base Operating System for the ceph cluster. If you are using a different Linux distribution then check the ceph documentation for the appropriate steps for your distribution.

On the 3 storage nodes (ceph01, ceph02 and ceph03) add a hard disk to the server which will act as the storage for the ceph Object Storage Daemon (OSD) – the storage pool which will eventually be useable in Kubernetes. In this example I’ve added a 50GB disk to each of these VMs.

Once the servers are deployed the following are performed on each server to update their repositories and upgrade any modules to current security levels. We will also upgrade the Linux kernel to a more up-to-date version by enabling the Ubuntu Hardware Extension (HWE) kernel which resolves some compatibility issues between ceph and older Linux kernel versions.

$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install --install-recommends linux-generic-hwe-16.04 -y

Each server should now be restarted to ensure the new Linux kernel is loaded and any added storage disks are recognised.

Ceph Admin Account

We need a user account configured on each of the ceph servers to allow ceph-deploy to work and to co-ordinate access, this account must NOT be named ‘ceph’ due to potential conflicts in the ceph-deploy scripts, but can be called just about anything else. In this lab environment I’ve used ‘cephadmin’. First we create the account on each server and set the password, the 3rd line permits the cephadmin user to use ‘sudo’ without a password which is required for the ceph-deploy script:

$ sudo useradd -d /home/cephadmin -m cephadmin -s /bin/bash
$ sudo passwd cephadmin
$ echo "cephadmin ALL = (root) NOPASSWD:ALL" > /etc/sudoers.d/cephadmin

From now on, (unless specified) use the new cephadmin login to perform each step. Next we need to generate an SSH key pair for the ceph admin user and copy this to the authorized-keys file on each of the ceph nodes.

Execute the following on the ceph admin node (as cephadmin):

$ ssh-keygen -t rsa

Accept the default path (/home/cephadmin/.ssh/id_rsa) and don’t set a key passphrase. You should copy the generated .ssh/id_rsa (private key) file to your admin workstation so you can use it to authenticate to the ceph servers.

Next, enable password logins (temporarily) on the storage nodes (ceph01,2 & 3) by running the following on each node:

$ sudo sed -i "s/.*PasswordAuthentication.*/PasswordAuthentication yes/g" /etc/ssh/sshd_config
$ sudo systemctl restart sshd

Now copy the cephadmin public key to each of the other ceph nodes by running the following (again only on the admin node):

$ ssh-keyscan -t rsa ceph01 >> ~/.ssh/known_hosts
$ ssh-keyscan -t rsa ceph02 >> ~/.ssh/known_hosts
$ ssh-keyscan -t rsa ceph03 >> ~/.ssh/known_hosts
$ ssh-copy-id cephadmin@ceph01
$ ssh-copy-id cephadmin@ceph02
$ ssh-copy-id cephadmin@ceph03

You should now confirm you can ssh to each storage node as the cephadmin user from the admin node without being prompted for a password:

$ ssh cephadmin@ceph01 sudo hostname
$ ssh cephadmin@ceph02 sudo hostname
$ ssh cephadmin@ceph03 sudo hostname

If everything is working correctly then each command will return the appropriate hostname for each storage node without any password prompts.

Optional: It is now safe to re-disable password authentication on the ceph servers if required (since public key authentication will be used from now on) by:

$ sudo sed -i "s/.*PasswordAuthentication.*/PasswordAuthentication no/g" /etc/ssh/sshd_config
$ sudo systemctl restart sshd

You’ll need to resolve any authentication issues before proceeding as the ceph-deploy script relies on being able to obtain sudo-level remote access to all of the storage nodes to install ceph successfully.

You should also at this stage confirm that you have time synchronised to an external source on each ceph node so that the server clocks agree, by default on Ubuntu 16.04 timesyncd is configured automatically so nothing needs to be done here in our case. You can check this on Ubuntu 16.04 by running timedatectl:

Checking time/date settings using timedatectl

For some Linux distributions you may need to create firewall rules at this stage for ceph to function, generally port 6789/tcp (for mon) and the range 6800 to 7300 tcp (for OSD communication) need to be open between the cluster nodes. The default firewall settings in Ubuntu 16.04 allow all network traffic so this is not required (however, do not use this in a production environment without configuring appropriate firewalling).

Ceph Installation

On all nodes and signed-in as the cephadmin user (important!)
Add the release key:

$ wget -q -O- '' | sudo apt-key add -

Add ceph packages to your repository:

$ echo deb $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list

On the admin node only, update and install ceph-deploy:

$ sudo apt update; sudo apt install ceph-deploy -y

On all nodes, update and install ceph-common:

$ sudo apt update; sudo apt install ceph-common -y

Note: Installing ceph-common on the storage nodes isn’t strictly required as the ceph-deploy script can do this during cluster initiation, but pre-installing it in this way pulls in several dependencies (e.g. python v2 and associated modules) which can prevent ceph-deploy from running if not present so it is easier to do this way.

Next again working on the admin node logged in as cephadmin, make a directory to store the ceph cluster configuration files and change to that directory. Note that ceph-deploy will use and write files to the current directory so make sure you are in this folder whenever making changes to the ceph configuration.

$ sudo apt install ceph-deploy -y

Now we can create the initial ceph cluster from the admin node, use ceph-deploy with the ‘new’ switch and supply the monitor nodes (in our case all 3 nodes will be both monitors and OSD nodes). Make sure you do NOT use sudo for this command and only run on the admin node:

$ ceph-deploy new ceph01 ceph02 ceph03

If everything has run correctly you’ll see output similar to the following:

ceph-deploy output

Checking the contents of the ~/mycluster/ folder should show the cluster configuration files have been added:

$ ls -al ~/mycluster
total 24
drwxrwxr-x 2 cephadmin cephadmin 4096 Jan 25 01:03 .
drwxr-xr-x 5 cephadmin cephadmin 4096 Jan 25 00:57 ..
-rw-rw-r-- 1 cephadmin cephadmin  247 Jan 25 01:03 ceph.conf
-rw-rw-r-- 1 cephadmin cephadmin 7468 Jan 25 01:03 ceph-deploy-ceph.log
-rw------- 1 cephadmin cephadmin   73 Jan 25 01:03 ceph.mon.keyring

The ceph.conf file will look something like this:

$ cat ~/mycluster/ceph.conf
fsid = 98ca274e-f79b-4092-898a-c12f4ed04544
mon_initial_members = ceph01, ceph02, ceph03
mon_host =,,
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

Run the ceph installation for the nodes (again from the admin node only):

$ ceph-deploy install ceph01 ceph02 ceph03

This will run through the installation of ceph and pre-requisite packages on each node, you can check the ceph-deploy-ceph.log file after deployment for any issues or errors.

Ceph Configuration

Once you’ve successfully installed ceph on each node, use the following (again from only the admin node) to deploy the initial ceph monitor services:

$ ceph-deploy mon create-initial

If all goes well you’ll get some messages at the completion of this process showing the keyring files being stored in your ‘mycluster’ folder, you can check these exist:

$ ls -al ~/mycluster
total 168
drwxrwxr-x 2 cephadmin cephadmin   4096 Jan 25 01:17 .
drwxr-xr-x 5 cephadmin cephadmin   4096 Jan 25 00:57 ..
-rw------- 1 cephadmin cephadmin    113 Jan 25 01:17 ceph.bootstrap-mds.keyring
-rw------- 1 cephadmin cephadmin    113 Jan 25 01:17 ceph.bootstrap-mgr.keyring
-rw------- 1 cephadmin cephadmin    113 Jan 25 01:17 ceph.bootstrap-osd.keyring
-rw------- 1 cephadmin cephadmin    113 Jan 25 01:17 ceph.bootstrap-rgw.keyring
-rw------- 1 cephadmin cephadmin    151 Jan 25 01:17 ceph.client.admin.keyring
-rw-rw-r-- 1 cephadmin cephadmin    247 Jan 25 01:03 ceph.conf
-rw-rw-r-- 1 cephadmin cephadmin 128136 Jan 25 01:17 ceph-deploy-ceph.log
-rw------- 1 cephadmin cephadmin     73 Jan 25 01:03 ceph.mon.keyring

To avoid having to specify the monitor node address and ceph.client.admin.keyring path in every command, we can now deploy these to each node so they are available automatically. Again working from the ‘mycluster’ folder on the admin node:

$ ceph-deploy admin cephadmin ceph01 ceph02 ceph03

This should give the following:


Next we need to deploy the manager (‘mgr’) service to the OSD nodes, again working from the ‘mycluster’ folder on the admin node:

$ ceph-deploy mgr create ceph01 ceph02 ceph03

At this stage we can check that all of the mon and mgr services are started and ok by running (on the admin node):

$ sudo ceph -s
    id:     98ca274e-f79b-4092-898a-c12f4ed04544
    health: HEALTH_OK

    mon: 3 daemons, quorum ceph01,ceph02,ceph03
    mgr: ceph01(active), standbys: ceph02, ceph03
    osd: 0 osds: 0 up, 0 in

    pools:   0 pools, 0 pgs
    objects: 0  objects, 0 B
    usage:   0 B used, 0 B / 0 B avail

As you can see, the manager (‘mgr’) service is installed on all 3 nodes but only active on the first and in standby mode on the other 2 – this is normal and correct. The monitor (‘mon’) service is running on all of the storage nodes.

Next we can configure the disks attached to our storage nodes for use by ceph. Ensure that you know and use the correct identifier for your disk devices (in this case, we are using the 2nd SCSI disk attached to the storage node VMs which is at /dev/sdb so that’s what we’ll use in the commands below). As before, run the following only on the admin node:

$ ceph-deploy osd create --data /dev/sdb ceph01
$ ceph-deploy osd create --data /dev/sdb ceph02
$ ceph-deploy osd create --data /dev/sdb ceph03

For each command the last line of the logs shown when run should be similar to ‘Host ceph01 is now ready for osd use.’

We can now check the overall cluster health with:

$ ssh ceph01 sudo ceph health
$ ssh ceph01 sudo ceph -s
    id:     98ca274e-f79b-4092-898a-c12f4ed04544
    health: HEALTH_OK

    mon: 3 daemons, quorum ceph01,ceph02,ceph03
    mgr: ceph01(active), standbys: ceph02, ceph03
    osd: 3 osds: 3 up, 3 in

    pools:   0 pools, 0 pgs
    objects: 0  objects, 0 B
    usage:   3.0 GiB used, 147 GiB / 150 GiB avail

As you can see, the 3 x 50GB disks have now been added and the total (150 GiB) capacity is available under the data: section.

Now we need to create a ceph storage pool ready for Kubernetes to consume from – the default name of this pool is ‘rbd’ (if not specified), but it is strongly recommended to name it differently from the default when using for k8s so I’ve created a storage pool called ‘kube’ in this example (again running from the mycluster folder on the admin node):

$ sudo ceph osd pool create kube 30 30
pool 'kube' created

The two ’30’s are important – you should review the ceph documentation here for Pool, PG and CRUSH configuration to establish values for PG and PGP appropriate to your environment.

We now associated this pool with the rbd (RADOS block device) application so it is available to be used as a RADOS block device:

$ sudo ceph osd pool application enable kube rbd
enabled application 'rbd' on pool 'kube'

Testing Ceph Storage

The easiest way to test our ceph cluster is working correctly and can provide storage is to attempt creating and using a new RADOS Block Device (rbd) volume from our admin node.

Before this will work we need to tune the rbd features map by editing ceph.conf on our client to disable rbd features that aren’t available in our Linux kernel (on admin/client node):

$ echo "rbd_default_features = 7" | sudo tee -a /etc/ceph/ceph.conf
rbd_default_features = 7

Now we can test creating a volume:

$ sudo rbd create --size 1G kube/testvol01

Confirm that the volume exists:

$ sudo rbd ls kube

Get information on our volume:

$ sudo rbd info kube/testvol01
rbd image 'testvol01':
        size 1 GiB in 256 objects
        order 22 (4 MiB objects)
        id: 10e96b8b4567
        block_name_prefix: rbd_data.10e96b8b4567
        format: 2
        features: layering, exclusive-lock
        create_timestamp: Sun Jan 27 08:50:45 2019

Map the volume to our admin host (which creates the block device /dev/rbd0):

$ sudo rbd map kube/testvol01

Now we can create a temporary mount folder, make a filesystem on our volume and mount it to our temporary mount:

$ sudo mkdir /testmnt
$ sudo mkfs.xfs /dev/rbd0
meta-data=/dev/rbd0              isize=512    agcount=9, agsize=31744 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=1024   swidth=1024 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
$ sudo mount /dev/rbd0 /testmnt
$ df -vh
Filesystem      Size  Used Avail Use% Mounted on
udev            1.9G     0  1.9G   0% /dev
tmpfs           395M  5.7M  389M   2% /run
/dev/sda1       9.6G  2.2G  7.4G  24% /
tmpfs           2.0G     0  2.0G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/sda15      105M  3.4M  101M   4% /boot/efi
tmpfs           395M     0  395M   0% /run/user/1001
/dev/rbd0      1014M   34M  981M   4% /testmnt

We can see our volume has been mounted successfully and can now be used as any other disk.

To tidy up and remove our test volume:

$ sudo umount /dev/rbd0
$ sudo rbd unmap kube/testvol01
$ sudo rbd remove kube/testvol01
Removing image: 100% complete...done.
$ sudo rmdir /testmnt

Kubernetes CSE Cluster

Using VMware Container Service Extension (CSE) makes it easy to deploy and configure a base Kubernetes cluster into our vCloud Director platform. I previously wrote a post here with a step-by-step guide to using CSE.

First we need an ssh key pair to provide to the CSE nodes as they are deployed to allow us to access them. You could re-use the cephadmin key-pair created in the previous section, or generate a new set. As I’m using Windows as my client OS I used the puttygen utility included in the PuTTY package to generate a new keypair and save them to a .ssh directory in my home folder.

Important Note: Check your public key file in a text editor prior to deploying the cluster, if it looks like this:

This image has an empty alt attribute; its file name is image-12.png
Public key as generated by PuTTYGen (incorrect)

You will need to change it to be all on one line starting ‘ssh-rsa’ and with none of the extra text as follows:

This image has an empty alt attribute; its file name is image-11.png

If you do not make this change this you won’t be able to authenticate to your cluster nodes once deployed.

Next we login to vCD using the vcd-cli (see my post linked above if you need to install/configure vcd-cli and the CSE extension):

Logging in to vcd-cli

Now we can see what virtual Datacenters (VDCs) are available to us:

Showing available VDCs

If we had multiple VDCs available, we need to select which one is ‘in_use’ (active) for deployment of our cluster using ‘vcd vdc use “<VDC Name>”‘. In this case we only have a single VDC and it’s already active/in use.

We can get the information of our VDC which will help us fill out the required properties when creating our k8s cluster:

VDC Properties returned by vdc info

We will be using the ‘Tyrell Servers A03’ network (where our ceph cluster exists) and the ‘A03 VSAN Performance’ storage profile for our cluster.

To get the options available when creating a cluster we can see the cluster creation help:

CSE cluster create options

Now we can go ahead and create out Kubernetes cluster with CSE:

Looking in vCloud Director we can see the new vApp and VMs deployed:

We obtain the kubectl config of our cluster and store this for later use (make the .kube folder first if it doesn’t already exist):

C:\Users\admin>vcd cse cluster config k8sceph > .kube\config

And get the details of our k8s nodes from vcd-cli:

Next we need to update and install the ceph client on each cluster node – run the following on each node (including the master). To do this we can connect via ssh as root using the key pair we specified when creating the cluster.

# wget -q -O- '' | sudo apt-key add -
# echo deb $(lsb_release -sc) main > /etc/apt/sources.list.d/ceph.list
# apt-get update
# apt-get install --install-recommends linux-generic-hwe-16.04 -y
# apt-get install ceph-common -y
# reboot

You should now be able to connect from an admin workstation and get the nodes in the kubernetes cluster from kubectl (if you do not already have kubectl installed on your admin workstation, see here for instructions).

Note: if you expand the CSE cluster at any point (add nodes), you will need to repeat this series of commands on each new node in order for it to be able to mount rbd volumes from the ceph cluster.

You should also be able to verify that the core kubernetes services are running in your cluster:

The ceph configuration files from the ceph cluster nodes need to be added to all nodes in the kubernetes cluster. Depending on which ssh keys you have configured for access, you may be able to do this directly from the ceph admin node as follows:

$ sudo scp /etc/ceph/ceph.* root@
$ sudo scp /etc/ceph/ceph.* root@
$ sudo scp /etc/ceph/ceph.* root@
$ sudo scp /etc/ceph/ceph.* root@

If not, manually copy the /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring files to each of the kubernetes nodes using copy/paste or scp from your admin workstation (copy the files from the ceph admin node to ensure that the rbd_default_features line is included).

To confirm everything is configured correctly, we should now be able to create and mount a test rbd volume on any of the kubernetes nodes as we did for the ceph admin node previously:

root@mstr-x4nb:~# rbd create --size 1G kube/testvol02
root@mstr-x4nb:~# rbd ls kube
root@mstr-x4nb:~# rbd info kube/testvol02
rbd image 'testvol02':
        size 1 GiB in 256 objects
        order 22 (4 MiB objects)
        id: 10f36b8b4567
        block_name_prefix: rbd_data.10f36b8b4567
        format: 2
        features: layering, exclusive-lock
        create_timestamp: Sun Jan 27 21:56:59 2019
root@mstr-x4nb:~# rbd map kube/testvol02
root@mstr-x4nb:~# mkdir /testmnt
root@mstr-x4nb:~# mkfs.xfs /dev/rbd0
meta-data=/dev/rbd0              isize=512    agcount=9, agsize=31744 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=1024   swidth=1024 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
root@mstr-x4nb:~# mount /dev/rbd0 /testmnt
root@mstr-x4nb:~# df -vh
Filesystem      Size  Used Avail Use% Mounted on
udev            1.9G     0  1.9G   0% /dev
tmpfs           395M  5.7M  389M   2% /run
/dev/sda1       9.6G  4.0G  5.6G  42% /
tmpfs           2.0G     0  2.0G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/sda15      105M  3.4M  101M   4% /boot/efi
tmpfs           395M     0  395M   0% /run/user/0
/dev/rbd0      1014M   34M  981M   4% /testmnt
root@mstr-x4nb:~# umount /testmnt
root@mstr-x4nb:~# rbd unmap kube/testvol02
root@mstr-x4nb:~# rmdir /testmnt/
root@mstr-x4nb:~# rbd remove kube/testvol02
Removing image: 100% complete...done.

Note: If the rbd map command hangs you may still be running the stock Linux kernel on the kubernetes nodes – make sure you have restarted them.

Now we have a functional ceph storage cluster capable of serving block storage devices over the network, and a Kubernetes cluster configured able to mount rbd devices and use these. In the next section we will configure kubernetes and ceph together with the rbd-provisioner container to enable dynamic persistent storage for pods deployed into our infrastructure.

Putting it all together

Kubernetes secrets

We need to first tell Kubernetes account information to be used to connect to the ceph cluster, to do this we create a ‘secret’ for the ceph admin user, and also create a client user to be used by k8s provisioning. Working on the kubernetes master node is easiest for this as it has ceph and kubectl already configured from our previous steps:

# ceph auth get-key client.admin

This will return a key like ‘AQCLY0pcFXBYIxAAhmTCXWwfSIZxJ3WhHnqK/w==’ which is used in the next command (Note: the ‘=’ sign between –from-literal and key is not a typo – it actually needs to be like this).

# kubectl create secret generic ceph-secret --type="" \
--from-literal=key='AQCLY0pcFXBYIxAAhmTCXWwfSIZxJ3WhHnqK/w==' --namespace=kube-system
secret "ceph-secret" created

We can now create a new ceph user ‘kube’ and register the secret from this user in kubernetes as ‘ceph-secret-kube’:

# ceph auth get-or-create client.kube mon 'allow r' osd 'allow rwx pool=kube'
        key = AQDqZU5c0ahCOBAA7oe+pmoLIXV/8OkX7cNBlw==
# kubectl create secret generic ceph-secret-kube --type="" \
--from-literal=key='AQDqZU5c0ahCOBAA7oe+pmoLIXV/8OkX7cNBlw==' --namespace=kube-system
secret "ceph-secret-kube" created


Kubernetes is in the process of moving storage provisioners (such as the rbd one we will be using) out of its main packages and into separate projects and packages. There’s also an issue that the kubernetes-controller-manager container no longer has access to an ‘rbd’ binary in order to be able to connect to a ceph cluster directly. We therefore need to deploy a small ‘rbd-provisioner’ to act as the go-between from the kubernetes cluster to the ceph storage cluster. This project is available under this link and the steps below show how to obtain get a kubernetes pod running the rbd-provisioner service up and running (again working from the k8s cluster ‘master’ node):

# git clone
Cloning into 'external-storage'...
remote: Enumerating objects: 2, done.
remote: Counting objects: 100% (2/2), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 63661 (delta 0), reused 1 (delta 0), pack-reused 63659
Receiving objects: 100% (63661/63661), 113.96 MiB | 8.97 MiB/s, done.
Resolving deltas: 100% (29075/29075), done.
Checking connectivity... done.
# cd external-storage/ceph/rbd/deploy
# sed -r -i "s/namespace: [^ ]+/namespace: kube-system/g" ./rbac/clusterrolebinding.yaml ./rbac/rolebinding.yaml
# kubectl -n kube-system apply -f ./rbac "rbd-provisioner" created "rbd-provisioner" created
deployment.extensions "rbd-provisioner" created "rbd-provisioner" created "rbd-provisioner" created
serviceaccount "rbd-provisioner" created
# cd

You should now be able to see the ‘rbd-provisioner’ container starting and then running in kubernetes:

Testing it out

Now we can create our kubernetes Storageclass using this storage ready for a pod to make a persistent volume claim (PVC) against. Create the following as a new file (I’ve named mine ‘rbd-storageclass.yaml’). Change the ‘monitors’ line to reflect the IP addresses of the ‘mon’ nodes in your ceph cluster (in our case these are on the ceph01, ceph02 and ceph03 nodes on the IP addresses shown in the file).

kind: StorageClass
  name: rbd
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: kube-system
  pool: kube
  userId: kube
  userSecretName: ceph-secret-kube
  userSecretNamespace: kube-system
  imageFormat: "2"
  imageFeatures: layering

You can then add this StorageClass to kubernetes using:

# kubectl create -f ./rbd-storageclass.yaml "rbd" created

Next we can create a test PVC and make sure that storage is created in our ceph cluster and assigned to the pod. Create a new file ‘pvc-test.yaml’ as:

kind: PersistentVolumeClaim
apiVersion: v1
  name: testclaim
    - ReadWriteOnce
      storage: 1Gi
  storageClassName: rbd

We can now submit the PVC to kubernetes and check it has been successfully created:

# kubectl create -f ./pvc-test.yaml
persistentvolumeclaim "testclaim" created
# kubectl get pvc testclaim
NAME        STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
testclaim   Bound     pvc-1e9bdbfd-22a8-11e9-ba77-005056340036   1Gi        RWO            rbd            21s
# kubectl describe pvc testclaim
Name:          testclaim
Namespace:     default
StorageClass:  rbd
Status:        Bound
Volume:        pvc-1e9bdbfd-22a8-11e9-ba77-005056340036
Labels:        <none>
Finalizers:    []
Capacity:      1Gi
Access Modes:  RWO
  Type    Reason                 Age   From                                                                               Message
  ----    ------                 ----  ----                                                                               -------
  Normal  ExternalProvisioning   3m    persistentvolume-controller                                                        waiting for a volume to be created, either by external provisioner "" or manually created by system administrator
  Normal  Provisioning           3m  External provisioner is provisioning volume for claim "default/testclaim"
  Normal  ProvisioningSucceeded  3m  Successfully provisioned volume pvc-1e9bdbfd-22a8-11e9-ba77-005056340036
# rbd list kube
# rbd info kube/kubernetes-dynamic-pvc-25e94cb6-22a8-11e9-aa61-7620ed8d4293
rbd image 'kubernetes-dynamic-pvc-25e94cb6-22a8-11e9-aa61-7620ed8d4293':
        size 1 GiB in 256 objects
        order 22 (4 MiB objects)
        id: 11616b8b4567
        block_name_prefix: rbd_data.11616b8b4567
        format: 2
        features: layering
        create_timestamp: Mon Jan 28 02:55:19 2019

As we can see, our test claim has successfully requested and bound a persistent storage volume from the ceph cluster.


VMware Container Service Extension 
VMware vCloud Director for Service Providers 

Wow, this post ended up way longer than I was anticipating when I started writing it. Hopefully there’s something useful for you in amongst all of that.

I’d like to thank members of the vExpert community for their encouragement and advice in getting this post written up and as always, if you have any feedback please leave a comment.

Time-permitting, there will be a followup to this post which details how to deploy containers to this platform using the persistent storage made available, both directly in Kubernetes and using Helm charts. I’d also like to cover some of the more advanced issues using persistent storage in containers raises – in particular backup/recovery and replication/high availability of data stored in this manner.


Installing and Using the vRealize Orchestrator (vRO) CLI

Something I was not aware of until recently was that vRealize Orchestrator (vRO) has it’s own Command Line Interface (CLI) environment. This can be an invaluable tool when developing new vRO workflows and actions as it allows you to easily test expressions and code snippets in your environment. Once developed these scripts and actions can be reused easily from vRO workflows in the vRO Workflow Designer.

Downloading vRO-CLI

Unfortunately vRO-CLI is not particularly well publicised (originating as a VMware ‘Fling’) and does not have official support, but it can still be a valuable tool. Due to the support status though I would only recommend installing and using this in a Test/Dev environment rather than in Production.

Locating the tool is currently a bit problematic, a google search for ‘vRO CLI’ will usually take you to this page:


Unfortunately this is an old version (from September 2015) and isn’t compatible with the latest (7.x) releases of vRO. To get to the latest version you will need to visit this page in the VMware community forums: which has the build 4693774 downloads which work with the latest vRO versions:


You will need to download at least 2 of the .zip files, the vRO plugin itself (ending in and the client application for whichever OS you will be using on your development machine (Linux or Windows).

Installing vRO-CLI

Installation is in 2 parts, the first installs the plugin into your vRO appliance to provide the endpoint for the vRO-CLI. The second installs the vRO-CLI client on your workstation to be able to use the service.

1) Installing the vRO-CLI plugin on the vRO appliance

Once you have downloaded both packages, the first step is to install the vRO-CLI plugin into your vRO instance.

Open the web page for your vRO appliance and select the ‘Open Control Center’ link:


Once signed in to the Control Center, select the ‘Manage Plug-Ins’ icon:


Unzip the file with the extension you previously downloaded (e.g. and use the Browse button to select the extracted file:


Click the ‘Upload’ button and when prompted accept the EULA agreement and click install:


You will get the following if everything has been successfully configured:


You need to let the Orchestrator service restart (just leave the configuration appliance and wait a couple of minutes) before the service will be available for client connections.

2) Installing the vRO-CLI Client

Since I’m using a Windows workstation for administration the following details the setup for a Windows vRO-CLI client. You will need to have Java installed and configured in your Windows OS prior to being able to run the vRO-CLI client.

Simply extract the .zip file (in this example, to your machine, I used ‘C:\Program FIles\vRO-CLI-2.0.0’ as the destination directory:


You can now start the vRO-CLI client using either the GUI or command-line. To start using the GUI, start the vcocli-gui.bat script from the o11nplugin-vcocli-dist-2.0.0\bin folder, you should see the vCO CLI login dialog:


Use the hostname of your vRO appliance as the ‘vCO address’ (without any port specification) and supply valid vRO user name and password.

Note that if you have a firewall or other network security between your workstation and the vRO server you will need to permit tcp port 8265 between them to allow connection.

The ‘Session name’ can be anything you like and can be used to reconnect to sessions which have already been started and suspended. Then click ‘New’ to create a new session, if everything has gone well you should see the initial vRO-CLI screen:


Also note that the ‘Quit Session’ button will terminate your current session and you will not be able to reconnect to it, but closing the window using the close (X) icon in the top-left will keep the session running and allow re-attachment to the same session later.

To use the CLI version, you can start the vcocli.bat file from a Windows command prompt and specify the –vco and –username switches to specify the vRO server:


If you want to see what sessions already exist and can be reconnected, you can open vRO client and browse under the ‘vCO CLI / Start Session’ in the tree, each running ‘Start Session’ token will show in the ‘Variables’ tab the session name for sessions which can be reconnected (using ‘Attach’ in the GUI or the –resume switch on the command line):


Using the vRO-CLI

Once you have the client installed and connecting successfully to your vRO server, how do you actually use it? One of the early challenges I faced was working out how to actually reference an object in the vRO object browser in my script fragments. The ‘Help’ documentation provides some useful content, but doesn’t address how to obtain object references like this.

In this example, let’s assume that we’re trying to write a script to perform an action on the ‘test03’ VM in our environment owned by the vCD tenant ‘Tyrell’ in the VDC ‘Tyrell A03 Allocated’ and in the vApp ‘test03’. We can browse down the objects to locate this VM in the vRO-CLI UI:


Now the Server.findForType function can be used when supplied with an object type and the dunesId reference for our object:

var myTest03VM = Server.findForType('vCloud:VM','602d4ed77a4d20e9f854214a808ffcc2e878185e973d38446ad2bac2a623081////https://<my vCD server>/api/vApp/vm-98ae8d30-1090-4958-9061-f1a86590dc7b');

You can copy the value of the object by highlighting the ‘dunesId’ line in the object properties window and copy (Ctrl + C) and pasting (Ctrl + V) this into your command. (Note that this will also include the ‘dunesId’ text which will need to be removed).

This allows us to extract the output of the ‘toXml()’ method for our VM as follows:


All of the usual vRO methods and functions for the object are also available to us. If you need to initiate existing vRO workflows or actions the online help has details on how this can be initiated too.

Note: If you don’t highlight any text in the upper Input area, the entire script will be executed, but if you highlight a block of code or a single line, only that code will be executed when you click the ‘Execute’ icon (or press F5) which can be extremely useful to try out fragments of code and check the results are as expected.

Hopefully this will be useful to those of you developing workflows and actions in vRO and provide you with another method to write and debug your actions.

As always, comments and feedback are always welcome.


First thoughts on AWS Re:Invent 2018 and where VMware is missing the point

Firstly I’ll start this off by saying that I don’t usually write opinion pieces – I’d much rather share some cool technology or tips that I’ve come across in my day job (or even playing with technology in my own time). Secondly, my perspective is likely a bit skewed – I work in one of the most virtualized and cloud-adopting countries in the world (New Zealand) where the vast majority of customers I speak to have no on-premises server environment any more or are planning to retire the ones they have. These customers have either shifted entirely to public cloud, or are using the services of providers such as my employer to provide local private cloud platforms for them.

In particular in Christchurch where I’m based we had a series of devastating earthquakes in 2010 and 2011 which accelerated this shift – many customers simply felt no need to rebuild datacenters when rebuilding their offices given available local provider and public cloud options open to them. Having a high-speed urban fibre network across the entire country has almost certainly helped to accelerate this trend.

For those not familiar, hosting providers such as ourselves who run a largely VMware software stack use the usual components (vSphere, vCenter, ESXi, NSX networking and sometimes VSAN storage) as the foundation of our platforms, but VMware provide an additional software layer ‘vCloud Director’ (vCD) which sits on top of all of these to provide a secure multi-tenant platform. It also provides a feature-rich public-facing portal and API to allow orchestration and automation as well as being extensible by a plugin architecture so that Operations Management, Backup/Recovery, Replication, Container hosting (to name 4) and other services can be easily integrated and published to our customers. The way this architecture has been implemented also makes it reasonably easy for 3rd parties to write their own vCD extensions and have these seamlessly published into the same environment.

Recent improvements in vCD 9.5 have included the move to a native HTML5 portal, the addition of many more customization and configuration options, support for multi-site deployments where customers consume resources in multiple physical locations, as well as new networking functionality allowing seamless networking across multi-site deployments. In addition, the new tenant portals for vCloud Availability Cloud to Cloud DR and vRealize Operations allow customers to completely manage their DR replication and failover as well as gain operational insights and management of their deployed workloads. This blog post covers the most recently enabled functionality in vCloud Director for those that wish to find more information.

Many of our customers are also leveraging public cloud infrastructure platforms for a variety of reasons including advanced features, hyper-scale elasticity, ease of operations and management and (often) to decrease their overall IT infrastructure spend. Often overlooked though is that the main driver for many customers is to have their internal IT teams concentrating on business applications and data – looking for ways to add business value to their organizations rather than in ‘feeding and watering’ infrastructure in their own datacenters. In fact many of the reasons customers chose a provider such as ourselves are very similar. The determining factors are often older applications that can’t survive at the reasonably significant latencies which are inevitable from New Zealand to our closest public cloud platforms in Australia, and some concerns around data sovereignty (although these are largely diminishing).

Of the AWS announcements made this week at their annual Re:Invent , AWS Outposts is a fascinating platform proposition which, if done well, will be a great option for a local AWS consistent platform in local (NZ) datacenters. I’m also  impressed with the announcements for new and enhanced AWS services – S3 Glacier Deep Archive could well spell the ‘final’ end of tape as a data archival technology for example. AWS Control Tower as a simplified way to easily deploy a landing zone into AWS is another service which I think will resonate well with many of our customers who find it challenging to deploy their initial AWS footprints with appropriate security, controls and governance. AWS TimeStream finally provides a ‘proper’ way of dealing with time-series data in huge volumes without trying to squish it into a relational database with all the issues that creates. Perhaps the most interesting is AWS RDS on VMware which was announced back in August at VMworld and allows AWS RDS services to run in a vSphere environment in a local datacenter with support for data replication and DR.

So with all that said, why do I think VMware is missing the point? – after all there are some great technologies and services being made available from both vendors.

Let’s take a look at some new/recent VMware products and services and see:

1) VMware Cloud Assembly

A great new technology to allow easy construction of templates and blueprints to speed deployment of application environments to multiple cloud endpoints. Supports all the major public cloud endpoints (AWS, Azure, GCP) as well as vCenter as the deployment endpoint.

2) VMware HCX (Hybrid Cloud Extension)

Awesome technology which allows live-migration of running business applications between vSphere sites (and even between vSphere on-premises and VMware cloud on AWS).

3) AWS RDS on vSphere

Mentioned above, but provides capability to run AWS consistent database services from all major RDS providers (MariaDB, MySQL, PostgreSQL, Oracle and SQL Server) in an on-premises vSphere environment. Can even allow these databases to span both an on-premises and AWS environment to provide scale, high availability and DR options.

4) VMware Cloud on AWS

Fantastic option that gives customers the option of deploying vSphere environments directly into public cloud and run their VMs with no changes whatsoever in that platform. Also with HCX (above) can live-migrate workloads in and out of public cloud. Provides consistent management, operations and security options across both platforms.

What do all of these have in common? You’ve probably guessed it – not a single one of them works with vCloud Director. If a customer wants to use HCX (for example) to seamlessly move workloads from a VCPP provider to AWS and back – not supported. Deploy Cloud Assembly blueprints to their VCPP provider? Nope, doesn’t work either, no support for the vCD API as an endpoint. Allow them to use AWS RDS services alongside their VCPP provider hosted VMs? No there also.

Basically it comes down to this: If you build a tool or service that only talks to vCenter (or vCenter APIs) and not to vCloud Director, you are missing out on making your products and services available to a large number (around 4,200 I believe right now) of VMware Cloud Provider Partners (VCPP) such as ourselves that offer vCloud Director as the primary interface and API for customers to manage their workloads. What’s more, from figures mentioned by VMware themselves, the number of workloads in VCPP provider datacenters managed through vCD is increasing massively ahead of vSphere and vCenter on-premises solutions.

One of the likely comments I’ll get to this post is ‘Well, you could just provided dedicated vSphere environments for each customer that needs these functions’. This is accurate – we definitely could do this, but the overhead of managing and maintaining a large number of discreet vSphere instances (including all of the management and operations tooling that these require) doesn’t scale well and would result in a huge amount of extra work. In addition, because we can’t securely multi-tenant vSphere environments there would be a huge amount of wasted capacity on hosts which aren’t heavily loaded or only exist to provide cluster hardware redundancy. This would make the solutions incredibly expensive by comparison to a true multi-tenanted platform.

So… if anyone from VMware is still reading by now… you’ve given your VCPP partners and providers an awesome platform in vCloud Director to allow a true multi-tenanted cloud platform, uptake and usage of this platform is in massive growth right now. Now please make sure the rest of your technologies can work with it.

As always, comments and feedback appreciated, how do other VCPP providers feel about this?



A few days after I first posted this, I saw this tweet from Steve Dockar pop up in my news feed linking to this youtube video which shows a preview version of Cloud Assembly using vCloud Director as the endpoint. This is awesome, and something which the VMware people I spoke to on the show floor at re:Invent obviously knew nothing about.

vCloud Director 9 HTML5 Portal Customization

One of the great features in vCloud Director 9 which has been further enhanced in the latest v9.5 release is the new HTML5 portal:


Even better, VMware has released a toolkit to allow Service Providers to fully customise the look and feel of the portal using CSS themes in their Clarity framework..

The toolkit itself is part of the VMware vcd-ext-sdk repository on github, available in the /ui/theme-generator folder.

The repository has good instructions on how to modify and build a custom theme, but actually uploading and configuring the theme in vCloud Director is only accessible via the vCD API and involves a reasonable amount of manual work.

To help speed up development and allow changes to be easily tested, in my usual mode I’ve written a small PowerShell module that allows quicker/easier theme configuration. The module is available on github at Hopefully this will help those of you who need to develop and test updated themes for your vCloud Director portals.

I’ve included documentation in the repository on each cmdlet, its function and arguments here.

To use the module you’ll need to be connected to a vCloud instance as a user with global ‘Administrator’ access in the ‘System’ organization since changes will affect all portal users. You’ll need to be connected to the vCD environment with PowerCLI (Connect-CIServer…) prior to using the cmdlets.

You can then download the vcd-ht-themes.psm1 file and add it to your session (‘Import-Module vcd-h5-themes.psm1’) to access the cmdlets.

As always, comments and feedback welcome – is there anything else you’d like to see added to this module?


VM Guest Customization in vCloud Director via PowerCLI

Bit of a quick post this, but hopefully useful to others.

I got asked recently if there was an easy way to set Guest Customization options for VMs hosted in vCloud Director via Powershell/PowerCLI. It turns out there is an extremely simple way, but the syntax is a bit awkward so figured it would make a good/quick blog post.

The Guest Customization settings are available as one of the ‘Section’ entries returned by accessing the ExtensionData properties on a CIVM object. Once connected (Connect-CIServer) you can see this from PowerCLI:


The ‘trick’ is that there are typically 5 sections (one each for OvfVSSD, OvfMsg, network connections, guest Customization and VMware tools). I’ve seen some approaches that rely on the ‘guest Customization’ setting always being found at the Section[3] index in the ExtensionData collection, but this could easily change in future and break any functionality relying on this. A much more reliable way of finding the guest Customization section values is:


But how about if you need to change/update a setting, luckily there is a method provided (UpdateServerData) which does exactly this. So if we want to (for example) change the ‘CustomizationScript’ setting to “echo “Hello World!” we can:


You can change other settings using the same method (e.g. ComputerName or Domain join settings).

Note that for many changes the VM must be powered off, and you may need to ‘Power On and Force Recustomization’ too.

As always, comments & feedback appreciated.


Getting detailed VM Disk Properties from the vCloud API

Since vCloud Director 8.10 VMware have allowed VMs to be created which have multiple disks using different storage policies. This can be very useful – for example, a database VM might have it’s database on fast storage but another disk containing backups or logs on slower/cheaper disk.

When trying to find out what storage is in use for a VM though this can create issues, the PowerCLI Get-CIVM cmdlet (and the Get-CIView cmdlet used to get extra information) aren’t able to properly report storage for VMs that consume multiple storage policies. This in turn can create problems for Service Providers when they need to report on overall VM disk usage divided by storage policy used.

As an example I’ve created a VM named ‘test01’ in a customer vDC which has 3 disks attached, the 2nd of these is on ‘Capacity’ tier storage while disks 1 and 3 are on ‘Performance’ storage. When we look at the VM details we see the following:


Digging into the ExtensionData shows


The StorageProfile element looks like it may contain what we need, but unfortunately this only shows the ‘home’ Storage for the VM and doesn’t indicate that at least one of the VMs disks is on a different storage profile:


After a lot of mucking around trying to find an easy way to discover the information, I ‘gave up’ and wrote a PowerShell module which accesses the vCD API directly to get the VM storage information (including storage tiers in use by each disk). The module isn’t overly efficient since it queries the storage profile reference for every disk on every VM (and so will result in a lot of calls if run for a large number of VMs), but otherwise works fine.

The module takes VM objects or a VM name as input and returns details on each disk attached to the VM including which storage profile they use. Save the script (e.g. as ‘Get-CIVMStorageProfile.psm1’) and then use ‘Import-Module .\Get-CIVMStorageProfile.psm1’ to import the function.

   Gets detailed storage information from a vCloud VM.

   This function returns detailed disk information for a vCloud VM. Specifically
   it shows the number of disks attached and the storage policy assigned to each
   disk which is useful when VMs consume storage from multiple policies.

.Parameter CIVM
   The VM object (from Get-CIVM) to report storage for.

   Get-CIVM -Name 'test01' | Get-CIVMStorageDetail

Function Get-CIVMStorageDetail
     begin {}

        # API version to use when communicating with vCloud Director - API 27.0 is vCloud Director 8.20:
         $vCDAPIVersion = "27.0"
         # Check if we've been passed a VM name or an actual VM object and handle appropriately
         if ($CIVM.GetType() -eq [String]) {
             try {
                 $VMObj = Get-CIVM -Name $CIVM -ErrorAction Stop
             } catch {
                 Write-Host -ForegroundColor Red "Error: Could not find a VM with the name $CIVM."
         else {
             $VMObj = $CIVM

        # Find our vCloud SessionId that matches the URI of the VM object:
         $SessionId = $global:DefaultCIServers.SessionId | Where-Object { $VMObj.href -match $_.ServiceUri }

        try {
             $vmxml = Invoke-RestMethod -Method Get -Uri "$($VMObj.href)/virtualHardwareSection/disks" -Headers @{'x-vcloud-authorization'=$SessionId; 'Accept'="application/*+xml;version=$($vCDAPIVersion)"} -ErrorAction Stop
         } catch {
             Write-Host -ForegroundColor Red "Error attempting to get VM details from API:"
             Write-Host -ForegroundColor Red "Status Code: $($_.Exception.Response.StatusCode.value__)"
             Write-Host -ForegroundColor Red "Status Description: $($_.Exception.Response.StatusDescription)"

        # Build an empty object for the VM disk details:
         $vmdisks = @()

        # RASD resource type 17 is a hard disk attached to a VM:
         foreach($disk in ($vmxml.RasdItemsList.Item | Where-Object -Property ResourceType -eq 17)) {
             # Dereference the StorageProfileHref for each disk to get the Storage Profile Name:
             try {
                 $sprof = Invoke-RestMethod -Method Get -Uri "$($disk.HostResource.storageProfileHref)" -Headers @{'x-vcloud-authorization'=$SessionId; 'Accept'="application/*+xml;version=$($vCDAPIVersion)"} -ErrorAction Stop
             } catch {
                 Write-Host -ForegroundColor Red "Error attempting to get Storage Profile Name:"
                 Write-Host -ForegroundColor Red "Status Code: $($_.Exception.Response.StatusCode.value__)"
                 Write-Host -ForegroundColor Red "Status Description: $($_.Exception.Response.StatusDescription)"
             $diskprops = @{
                 VMName         = [string]$VMObj.Name
                 InstanceID     = [string]$disk.InstanceID
                 StorageProfile = [string]$sprof.VdcStorageProfile.Name
                 CapacityGB     = [float][math]::Round(($disk.VirtualQuantity / 1024 / 1024 / 1024),3)
                 ElementName    = [string]$disk.ElementName

            $diskobj = New-Object PSObject -Property $diskprops
             $vmdisks += $diskobj

        return $vmdisks
     } # end process

Export-ModuleMember -Function Get-CIVMStorageDetail

And here is example output from the script for our test VM:


Hope this is useful to some of you and as always, appreciate any comments/feedback.

I’d also love to know if there’s an easier way of generating this information.


Tenant Portal Displays ‘No Datacenters are available’ in vCloud Director 9.1

We had an issue recently when updating our vCloud Director environment to v9.1 where the new tenant portal would show ‘No Datacenters are available’ for every tenant even though the remainder of the site worked correctly (and other tabbed options like the Service Library & catalogs worked fine). Initially we suspected that our SSL certificate chain or public URI’s were set incorrectly.

Adrian Begg has a great blog post here: which details this issue and how to ensure the correct settings are applied, however in our case this didn’t resolve our issue.

Eventually an offhand remark in a slack channel by Tom Fojta put me on the right track to solving the issue, I’ve written this post up in case anyone else comes across the same issue. If you’re impatient and want to know the solution – it’s DNS (isn’t it always DNS?), but that’s jumping ahead a bit.

In our environment we have 3 vCloud Director cell servers behind a load balancer, we also load-balance internally so that our management environment can talk to the vCD API and we can conduct testing of the environment without necessarily having it open to the public internet. The arrangement looks logically like this:


vCloud Director Load Balancing

Users from the internet accessing ‘’ get redirected to one of the vCD cell servers (and if one of them is unavailable the monitoring on the Load Balancer doesn’t direct requests there). The same happens for internal users, but in this case the ‘’ DNS entry has been overridden to point at the internal ( address to allow connectivity to the cells even if the external LB or internet link is unavailable.

The issue in our environment was that the cell servers themselves use DNS to access the vCloud API – and they use the public URL specified in the vCloud Director configuration.

The cell servers were configured with our internal DNS servers, so when they attempted to access the public URL (‘’) were being given the internal Load Balancer address ( For reasons we’re still exploring, this didn’t allow them to get a response from the vCD API and resulted in the ‘No Datacenters are available’ error in the tenant portal.

The fix turned out to be reasonably simple – on each cell server we added an entry to the /etc/hosts file to resolve the public URL to the cell’s own IP address, so on cell 01:

On cell02:

And on cell03:

Once we’d made this change the tenant portals began functioning correctly (note that no restart of the cell servers or vCloud Director services was required).

What I assume is happening is that when the internal load balancer responds the the request it gives out a different cell server’s address (since the ‘source’ of the request will be a cell server) and that cell server has no knowledge of the session being used by the original cell and so responds incorrectly (either with nothing, or with an error). Not sure if this is actually a bug, or just something to be aware of, but either way overriding name resolution in this way fixes the issue. Note that simply using ‘localhost’ or for the hosts file entry doesn’t work since the vCloud web server doesn’t respond on the loopback interface in the default configuration.

Just posting this here in the hope it will save someone else any frustration caused by this issue.


Writing vRealize Orchestrator Workflows for vCloud Director v9.1

One of the great new features in vCloud Director 9.1 is the ability to publish Orchestrator workflows to tenants which can be consumed from within their vCloud portal. Markus Kraus has written an excellent post showing the configuration process for linking Orchestrator and vCloud Director. This post shows the process for creating and deploying a workflow into this environment and shows the user experience when invoking the finished workflow. I can see a large range of possible use-cases for this capability – hopefully this post will give you an idea of what is possible.

Our demonstration workflow will be reasonably simple and cover a scenario where a tenant consuming an Allocated VDC needs more resources assigned to it. While this could be fully automated (directly make the changes to the vCloud Director VDC), it is probably more likely in these scenarios that the service provider would want to check and implement the change themselves. Therefore the following actions are required. The workflow tasks are therefore:

  • Extract the environment details (so we know which tenant user in which vCloud Organisation initiated the request.
  • Allow the tenant to enter the parameters for the extra resources they require.
  • Send an email to the service provider that contains the request details.
  • Send an email to the tenant user confirming the request details.

This example assumes that you have a functioning vRealize Orchestrator instance in the vCloud Director environment, and that you’ve registered the vCloud Director endpoint in Orchestrator. It also requires that you’ve configured the Orchestrator integration in the vCD provider portal and granted the new Service Library permissions to the tenant Organization Administrator role.

From the Orchestrator client, we first create a new workflow in Design view (I’ve created a folder to contain this called ‘vCD Workflows’ too), I’ve called my workflow ‘Request VDC Resources’:


The new (empty) workflow is created and displayed in the vRO editor:


Our first task is to create some workflow inputs to capture required inputs, these can be created in the ‘Inputs’ tab using the ‘Add parameter’ button:


We need the following information from the user to be able to process this workflow, so I’ve created the parameters, given them the correct Type and set a description for each one. In my lab environment there are two classes of storage profiles (performance and capacity) so I’ll ask the user to provide values for both if requesting a storage increase. To change the parameter names, types and descriptions just click in the fields. The final ‘Inputs’ tab now looks like this:


Now we can alter the Presentation tab to group the input fields appropriately. It can be a bit fiddly to add groups (Orchestrator adds steps each time which need to be removed), but with a bit of fiddling about you can get to something like this:


Next we need to add some attributes to our workflow to contain the subject and content of the email message together with some parameters to be passed to the send mail workflow, this can be done on the ‘General’ tab using the ‘add attribute’ button:


So that should be all of the information we require, next step is to go to the Schema tab and add a Scriptable Task by dragging the element between our start and end markers:


Double clicking on the title ‘Scriptable task’ allows us to set a friendly script name (‘Build Resource Email’ in this example). We can then select the ‘In’ tab for the script and select our input attributes and parameters:


We select the in-parameters (‘tenantEmail’, ‘addCPU’, ‘addRAM’, ‘addStoragePerf’ and ‘addStorageCap’) from this dialog and get the following listed under our script’s ‘IN’ tab:


We do the same to bind the ‘emailSubject’ and ‘emailContent’ attributes to our script’s ‘OUT’ tab:


Switching to the ‘Visual Binding’ tab, (sliding the edit pane up a bit for clarity) should now show something like the following:


We can now switch to the Scripting tab to actually write the code that will generate our email subject and body:


The code from this script is included below in case copy/paste is useful:

// Request VDC Resources
// Build email subject and body from input parameters and vCD context

var myCX = System.getContext();
var tenantOrg = myCX.getParameter("_vcd_orgName");
var tenantOrgId = myCX.getParameter("_vcd_orgId");
var tenantUser = myCX.getParameter("_vdc_userName");

emailSubject = "Additional VDC Resource Request from " + tenantUser + " at " + tenantOrg;
emailContent = "<p>The following resource request was made by " + tenantUser + " from Organization " + tenantOrg;
emailContent += " (Org ID: " + tenantOrgId + ")</p>"
emailContent += "<table><tr><th>Item</th><th>Value</th><th>Units</th></tr>"
if (addCPU > 0) { emailContent += "<tr><td>Additional CPU Resource</td><td>" + addCPU + "</td><td>GHz</td></tr>"; }
if (addRAM > 0) { emailContent += "<tr><td>Additional RAM Resource</td><td>" + addRAM + "</td><td>GB</td></tr>"; }
if (addStoragePerf > 0) {
     emailContent += "<tr><td>Additional Performance Storage</td><td>" + addStoragePerf + "</td><td>GB</td></tr>";
if (addStorageCap > 0) {
     emailContent += "<tr><td>Additional Capacity Storage</td><td>" + addStorageCap + "</td><td>GB</td></tr>";
emailContent += "</table>"

There’s nothing too complicated going on in the script, just building some HTML strings based on the input values we’ve received from the workflow. We query the _vdc_orgName, _vdc_orgId and _vdc_userName from our script environment to retrieve these values which are provided by the vCloud Director Service Library so we know which user in which tenant organisation has initiated the workflow.

Next we need to add the ‘send notification’ workflow to actually send an email, this can be dragged from the Mail folder under ‘All Workflows’ and placed after our script in the workflow Schema:


This first ‘Send notification’ will be the email sent to the service provider so I’ve renamed it as ‘Mail Provider’ and selected the ‘Source parameter’ items on the ‘IN’ tab as follows:


(To set each source parameter simply click on the ‘not set’ value and select the appropriate workflow attribute from the pop-up, ensure that unused parameters are set as NULL).

We can add a 2nd ‘Send Notification’ task to our workflow and configure it to send the same email content back to our tenant’s email address (the only change here is that the ‘toAddress’ parameter is set to the provided workflow tenant email address rather than the provider email address):


Our workflow is now complete and should be functional, we can ‘Validate’ and then Save and Close it in the Orchestrator client.

Our next step is to publish the workflow from the the provider interface of vCloud Director at https://<my vcloud IP>/provider

One logged in we see the following:


Selecting the 3-bars (highlighted) option and selecting ‘Content Libraries’ shows the Service Library:


First we need to select ‘Service Mangement’ and then ‘Service Categories’ tab to create a new Service Category (group) into which our workflow will be imported:


Clicking the ‘+’ sign allows us to define a new category and provide an icon for it:


Once saved we can return to the ‘Service Library’:


The ‘Import’ option allows us to add our workflow to the library, first we select the category we just created:


Next we select the source Orchestrator instance where our workflow lives:


Now we can browse the workflow tree and select our new workflow:


Finally we review and select ‘Done’ to create the library entry:


Using the ‘Manage’ button we can chose who the workflow should be published to:


If we don’t select ‘Publish to All Tenants’ then we can chose individual tenancies with the check boxes:


Clicking Save completes the process and publishes our workflow.

Note: I’ve had several instances where changes to publishing are not saved correctly, I’d suggest going back into the settings and checking these after making any changes.

Now if we log in to vCD as a tenant we can see the published workflow in the ‘Libraries’ option:



Clicking ‘Execute’ initialises our workflow and requests the input parameters we defined for the workflow:


Here I’ve asked for 16GB more RAM, 10GHz more CPU, 2TB more Capacity storage and 1TB more Performance storage.

Clicking ‘Finish’ submits the request and we see in our email that they have arrived populated with the details from our workflow:


This is of course a fairly basic example, and not designed to be useful ‘as is’, but hopefully has given you a good idea of the power and flexibility that Service Libraries introduce in vCloud Director v9.1 and given you some ideas of how they can be used.

Generating emails is a trivial example, but much more complicated workflows could easily be built (for example, to directly submit API requests to other systems as well as directly provisioning resources in vCloud Director on request).

As always, comments and feedback appreciated.


Using vCloud Director PowerCLI and vcd-cli with Federated User Accounts

One of the issues that vCloud Director user can run into is user authentication when using the PowerCLI and vcd-cli tools to manage their cloud deployments. For ‘Local’ user accounts defined in the vCloud Director portal this isn’t an issue as username/password are stored in the vCD database and can be directly authenticated. However, many customers want to federate their vCloud users with an external directory service (often Microsoft AD FS or other similar service). Typically this is done so that security groups in the external directory can be used to control access levels, and so that additional authentication mechanisms like 2-Factor Authentication (2FA) can be applied to accounts.

If you attempt to use CLI tools like vcd-cli or PowerCLI to authenticate with a federated user account you will get a ‘Login Failed’ or ‘Unauthorized’ failure and won’t be able to connect to the service.

Fortunately, both vcd-cli and PowerCLI allow you to use an existing browser vCloud session ID to connect to the vCD API. To use this you connect to your vCloud portal in a web browser and then then use your browser’s tools to find the session ID for your connection. Once you have the session ID you can create a PowerCLI or vcd-cli session using that token.

It can sometimes be easier to use a browser plugin or extension to help find the session ID, ones which show session cookies and/or HTTP headers work best, but even without these it is possible.

In Google Chrome for example, use <ctrl + shift + I> (or Menu / More Tools / Developer Tools) to open the developer interface. Next click on the ‘Network’ heading at the top of the developer panel and refresh the vCloud Director portal. Scroll down to one of the ‘amfsecure’ document lines and select the ‘Headers’ tab, you should see a panel similar to this:


You can simply copy the value from the highlighted entry (87489f6a17044d66bc36704ce5c4e45c in this example) and use that to establish a vcd-cli or PowerCLI session:

For vcd-cli:

vcd login <cloud endpoint> <org name> <user name> –d <session ID string>


vcd login myorg joebloggs –d 87489f6a17044d66bc36704ce5c4e45c

For PowerCLI:

Connect-CIServer –Server <cloud endpoint> –SessionID <session ID string>


Connect-CIServer –Server –SessionID 87489f6a17044d66bc36704ce5c4e45c

You will then be connected as the same user from your browser session and able to run all the PowerCLI or vcd-cli commands with that user account.

An easier way?

Rather than digging around for HTTP headers and cookies in a browser, vcd-cli has a built-in module which is meant to retrieve the sessionID from a browser session automatically and use this to authenticate, the syntax is:

vcd login session list chrome

Which should return the session ID from an instance of Chrome, but in my initial testing this was not returning any output at all.

Reading through the vcd-cli sources it appears that this option relies on a Python extension ‘browsercookie’ which can be installed using pip install --user browsercookie. Browsercookie has a dependency on the ‘pycrypto’ module which must also be installed. However, even with both pycrypto and browsercookie installed I couldn’t get this option to work.

I did manage to get this working by installing the browser_cookie3 module from by using pip install --user browser-cookie3 and then making the following changes in the vcd-cli\ file:

Line 24: Change:

from vcd_cli import browsercookie to: import browser_cookie3

On both lines 126 and 148: Change:

cookies = to: cookies =

Once these changes are complete the ‘vcd login session list chrome’ command can be used to obtain the current session ID from Chrome automatically:


And this can be used directly to login automatically once a Chrome session exists using the --use-browser-session switch.

Also note that you can obtain the session ID like this from vcd-cli and use it to authenticate a PowerCLI session with no issues at all.