Installing and Using the vRealize Orchestrator (vRO) CLI

Something I was not aware of until recently was that vRealize Orchestrator (vRO) has its own Command Line Interface (CLI) environment. This can be an invaluable tool when developing new vRO workflows and actions as it allows you to easily test expressions and code snippets in your environment. Once developed, these scripts and actions can easily be reused from workflows in the vRO Workflow Designer.

Downloading vRO-CLI

Unfortunately vRO-CLI is not particularly well publicised (originating as a VMware ‘Fling’) and does not have official support, but it can still be a valuable tool. Due to the support status though I would only recommend installing and using this in a Test/Dev environment rather than in Production.

Locating the tool is currently a bit problematic; a Google search for ‘vRO CLI’ will usually take you to this page: https://labs.vmware.com/flings/vco-cli:

image

Unfortunately this is an old version (from September 2015) and isn’t compatible with the latest (7.x) releases of vRO. To get the latest version you will need to visit this page in the VMware community forums: https://communities.vmware.com/docs/DOC-31702, which has the build 4693774 downloads that work with the latest vRO versions:

image

You will need to download at least 2 of the .zip files: the vRO plugin itself (ending in .vmoapp.zip) and the client application for whichever OS you will be using on your development machine (Linux or Windows).

Installing vRO-CLI

Installation is in 2 parts: the first installs the plugin into your vRO appliance to provide the endpoint for the vRO-CLI; the second installs the vRO-CLI client on your workstation so you can use the service.

1) Installing the vRO-CLI plugin on the vRO appliance

Once you have downloaded both packages, the first step is to install the vRO-CLI plugin into your vRO instance.

Open the web page for your vRO appliance and select the ‘Open Control Center’ link:

image

Once signed in to the Control Center, select the ‘Manage Plug-Ins’ icon:

image

Unzip the file with the .vmoapp.zip extension you previously downloaded (e.g. o11nplugin-vcocli-2.0.0-4693774.vmoapp.zip) and use the Browse button to select the extracted file:

image

Click the ‘Upload’ button and, when prompted, accept the EULA and click ‘Install’:

image

You will get the following if everything has been successfully configured:

image

You need to let the Orchestrator service restart (just leave the configuration appliance and wait a couple of minutes) before the service will be available for client connections.

2) Installing the vRO-CLI Client

Since I’m using a Windows workstation for administration, the following details the setup for a Windows vRO-CLI client. You will need Java installed and configured in your Windows OS before you can run the vRO-CLI client.

Simply extract the .zip file (in this example, o11nplugin-vcocli-dist-2.0.0-4693774-windows.zip) to your machine; I used ‘C:\Program Files\vRO-CLI-2.0.0’ as the destination directory:

image

You can now start the vRO-CLI client using either the GUI or the command line. To use the GUI, run the vcocli-gui.bat script from the o11nplugin-vcocli-dist-2.0.0\bin folder and you should see the vCO CLI login dialog:

image

Use the hostname of your vRO appliance as the ‘vCO address’ (without any port specification) and supply a valid vRO username and password.

Note that if you have a firewall or other network security between your workstation and the vRO server, you will need to permit TCP port 8265 between them to allow the connection.

The ‘Session name’ can be anything you like and can be used to reconnect to sessions which have already been started and suspended. Click ‘New’ to create a new session; if everything has gone well you should see the initial vRO-CLI screen:

image

Also note that the ‘Quit Session’ button will terminate your current session and you will not be able to reconnect to it, but closing the window using the close (X) icon in the top-left will keep the session running and allow re-attachment to the same session later.

To use the CLI version, you can run the vcocli.bat file from a Windows command prompt with the --vco and --username switches to specify the vRO server and user account:

image
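For reference, the invocation looks something like the line below (the path, hostname and username are placeholders only, and the switches should match those shown in the screenshot above):

C:\Program Files\vRO-CLI-2.0.0\o11nplugin-vcocli-dist-2.0.0\bin>vcocli.bat --vco <vRO appliance hostname> --username <vRO user>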

If you want to see which sessions already exist and can be reconnected, you can open the vRO client and browse under ‘vCO CLI / Start Session’ in the tree; each running ‘Start Session’ token will show, in its ‘Variables’ tab, the session name for sessions which can be reconnected (using ‘Attach’ in the GUI or the --resume switch on the command line):

image

Using the vRO-CLI

Once you have the client installed and connecting successfully to your vRO server, how do you actually use it? One of the early challenges I faced was working out how to reference an object from the vRO object browser in my script fragments. The ‘Help’ documentation provides some useful content, but doesn’t address how to obtain object references like this.

In this example, let’s assume that we’re trying to write a script to perform an action on the ‘test03’ VM in our environment owned by the vCD tenant ‘Tyrell’ in the VDC ‘Tyrell A03 Allocated’ and in the vApp ‘test03’. We can browse down the objects to locate this VM in the vRO-CLI UI:

image

Now the Server.findForType function can be used when supplied with an object type and the dunesId reference for our object:

var myTest03VM = Server.findForType('vCloud:VM','602d4ed77a4d20e9f854214a808ffcc2e878185e973d38446ad2bac2a623081////https://<my vCD server>/api/vApp/vm-98ae8d30-1090-4958-9061-f1a86590dc7b');

You can copy the value for the object by highlighting the ‘dunesId’ line in the object properties window, copying it (Ctrl + C) and pasting it (Ctrl + V) into your command. (Note that this will also include the ‘dunesId’ text, which will need to be removed.)

This allows us to extract the output of the ‘toXml()’ method for our VM as follows:

image

All of the usual vRO methods and functions for the object are also available to us. If you need to initiate existing vRO workflows or actions, the online help has details on how to do this too.

Note: If you don’t highlight any text in the upper Input area, the entire script will be executed when you click the ‘Execute’ icon (or press F5); if you highlight a block of code or a single line, only that code will be executed. This is extremely useful for trying out fragments of code and checking the results are as expected.

Hopefully this will be useful to those of you developing workflows and actions in vRO and provide you with another method to write and debug your actions.

As always, comments and feedback are welcome.

Jon.

First thoughts on AWS Re:Invent 2018 and where VMware is missing the point

Firstly I’ll start this off by saying that I don’t usually write opinion pieces – I’d much rather share some cool technology or tips that I’ve come across in my day job (or even playing with technology in my own time). Secondly, my perspective is likely a bit skewed – I work in one of the most virtualized and cloud-adopting countries in the world (New Zealand) where the vast majority of customers I speak to have no on-premises server environment any more or are planning to retire the ones they have. These customers have either shifted entirely to public cloud, or are using the services of providers such as my employer to provide local private cloud platforms for them.

In particular in Christchurch where I’m based we had a series of devastating earthquakes in 2010 and 2011 which accelerated this shift – many customers simply felt no need to rebuild datacenters when rebuilding their offices given available local provider and public cloud options open to them. Having a high-speed urban fibre network across the entire country has almost certainly helped to accelerate this trend.

For those not familiar, hosting providers such as ourselves who run a largely VMware software stack use the usual components (vSphere, vCenter, ESXi, NSX networking and sometimes VSAN storage) as the foundation of our platforms, but VMware provide an additional software layer ‘vCloud Director’ (vCD) which sits on top of all of these to provide a secure multi-tenant platform. It also provides a feature-rich public-facing portal and API to allow orchestration and automation as well as being extensible by a plugin architecture so that Operations Management, Backup/Recovery, Replication, Container hosting (to name 4) and other services can be easily integrated and published to our customers. The way this architecture has been implemented also makes it reasonably easy for 3rd parties to write their own vCD extensions and have these seamlessly published into the same environment.

Recent improvements in vCD 9.5 have included the move to a native HTML5 portal, the addition of many more customization and configuration options, support for multi-site deployments where customers consume resources in multiple physical locations, as well as new networking functionality allowing seamless networking across multi-site deployments. In addition, the new tenant portals for vCloud Availability Cloud to Cloud DR and vRealize Operations allow customers to completely manage their DR replication and failover as well as gain operational insights and management of their deployed workloads. This blog post covers the most recently enabled functionality in vCloud Director for those that wish to find more information.

Many of our customers are also leveraging public cloud infrastructure platforms for a variety of reasons including advanced features, hyper-scale elasticity, ease of operations and management and (often) to decrease their overall IT infrastructure spend. Often overlooked though is that the main driver for many customers is to have their internal IT teams concentrating on business applications and data – looking for ways to add business value to their organizations rather than in ‘feeding and watering’ infrastructure in their own datacenters. In fact many of the reasons customers chose a provider such as ourselves are very similar. The determining factors are often older applications that can’t survive at the reasonably significant latencies which are inevitable from New Zealand to our closest public cloud platforms in Australia, and some concerns around data sovereignty (although these are largely diminishing).

Of the AWS announcements made this week at their annual Re:Invent conference, AWS Outposts is a fascinating platform proposition which, if done well, will be a great option for an AWS-consistent platform in local (NZ) datacenters. I’m also impressed with the announcements for new and enhanced AWS services – S3 Glacier Deep Archive could well spell the ‘final’ end of tape as a data archival technology, for example. AWS Control Tower, as a simplified way to easily deploy a landing zone into AWS, is another service which I think will resonate well with many of our customers who find it challenging to deploy their initial AWS footprints with appropriate security, controls and governance. AWS TimeStream finally provides a ‘proper’ way of dealing with time-series data in huge volumes without trying to squish it into a relational database with all the issues that creates. Perhaps the most interesting is AWS RDS on VMware, which was announced back in August at VMworld and allows AWS RDS services to run in a vSphere environment in a local datacenter with support for data replication and DR.

So with all that said, why do I think VMware is missing the point? After all, there are some great technologies and services being made available by both vendors.

Let’s take a look at some new/recent VMware products and services and see:

1) VMware Cloud Assembly

A great new technology to allow easy construction of templates and blueprints to speed deployment of application environments to multiple clouds. It supports all the major public clouds (AWS, Azure, GCP) as well as vCenter as deployment endpoints.

2) VMware HCX (Hybrid Cloud Extension)

Awesome technology which allows live-migration of running business applications between vSphere sites (and even between vSphere on-premises and VMware cloud on AWS).

3) AWS RDS on vSphere

Mentioned above, but provides capability to run AWS consistent database services from all major RDS providers (MariaDB, MySQL, PostgreSQL, Oracle and SQL Server) in an on-premises vSphere environment. Can even allow these databases to span both an on-premises and AWS environment to provide scale, high availability and DR options.

4) VMware Cloud on AWS

Fantastic option that gives customers the ability to deploy vSphere environments directly into public cloud and run their VMs with no changes whatsoever on that platform. Also, with HCX (above), workloads can be live-migrated in and out of public cloud. Provides consistent management, operations and security options across both platforms.

What do all of these have in common? You’ve probably guessed it – not a single one of them works with vCloud Director. If a customer wants to use HCX (for example) to seamlessly move workloads from a VCPP provider to AWS and back – not supported. Deploy Cloud Assembly blueprints to their VCPP provider? Nope, that doesn’t work either; there is no support for the vCD API as an endpoint. Allow them to use AWS RDS services alongside their VCPP provider-hosted VMs? Not there either.

Basically it comes down to this: If you build a tool or service that only talks to vCenter (or vCenter APIs) and not to vCloud Director, you are missing out on making your products and services available to a large number (around 4,200 I believe right now) of VMware Cloud Provider Partners (VCPP) such as ourselves that offer vCloud Director as the primary interface and API for customers to manage their workloads. What’s more, from figures mentioned by VMware themselves, the number of workloads in VCPP provider datacenters managed through vCD is increasing massively ahead of vSphere and vCenter on-premises solutions.

One of the likely comments I’ll get to this post is ‘Well, you could just provide dedicated vSphere environments for each customer that needs these functions’. This is accurate – we definitely could do this, but the overhead of managing and maintaining a large number of discrete vSphere instances (including all of the management and operations tooling that these require) doesn’t scale well and would result in a huge amount of extra work. In addition, because we can’t securely multi-tenant vSphere environments there would be a huge amount of wasted capacity on hosts which aren’t heavily loaded or only exist to provide cluster hardware redundancy. This would make the solutions incredibly expensive by comparison to a true multi-tenanted platform.

So… if anyone from VMware is still reading by now… you’ve given your VCPP partners and providers an awesome platform in vCloud Director for delivering true multi-tenanted clouds, and uptake and usage of that platform is growing massively right now. Now please make sure the rest of your technologies can work with it.

As always, comments and feedback are appreciated. How do other VCPP providers feel about this?

Jon

Update:

A few days after I first posted this, I saw this tweet from Steve Dockar pop up in my news feed, linking to this YouTube video which shows a preview version of Cloud Assembly using vCloud Director as the endpoint. This is awesome, and something which the VMware people I spoke to on the show floor at re:Invent obviously knew nothing about.

vCloud Director 9 HTML5 Portal Customization

One of the great features in vCloud Director 9 which has been further enhanced in the latest v9.5 release is the new HTML5 portal:

image

Even better, VMware has released a toolkit to allow Service Providers to fully customise the look and feel of the portal using CSS themes in their Clarity framework.

The toolkit itself is part of the VMware vcd-ext-sdk repository on github, available in the /ui/theme-generator folder.

The repository has good instructions on how to modify and build a custom theme, but actually uploading and configuring the theme in vCloud Director can only be done via the vCD API and involves a reasonable amount of manual work.

To help speed up development and allow changes to be easily tested, in my usual mode I’ve written a small PowerShell module that allows quicker/easier theme configuration. The module is available on github at https://github.com/jondwaite/vcd-h5-themes. Hopefully this will help those of you who need to develop and test updated themes for your vCloud Director portals.

I’ve included documentation in the repository on each cmdlet, its function and its arguments.

To use the module you’ll need to be connected to a vCloud instance as a user with global ‘Administrator’ access in the ‘System’ organization, since changes will affect all portal users. Connect to the vCD environment with PowerCLI (Connect-CIServer…) prior to using the cmdlets.

You can then download the vcd-h5-themes.psm1 file and add it to your session (‘Import-Module vcd-h5-themes.psm1’) to access the cmdlets.
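As a quick getting-started sketch (the server name below is an example, and rather than risk mis-quoting individual cmdlet names here I’d suggest listing them and then checking the repository documentation for their parameters):

# Connect to the vCD instance as a 'System' org administrator
Connect-CIServer -Server vcd.example.com -Org 'System'

# Import the module from the downloaded file and list the cmdlets it provides
Import-Module .\vcd-h5-themes.psm1
Get-Command -Module vcd-h5-themes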

As always, comments and feedback welcome – is there anything else you’d like to see added to this module?

Jon.

VM Guest Customization in vCloud Director via PowerCLI

Bit of a quick post this, but hopefully useful to others.

I got asked recently if there was an easy way to set Guest Customization options for VMs hosted in vCloud Director via PowerShell/PowerCLI. It turns out there is an extremely simple way, but the syntax is a bit awkward so I figured it would make a good, quick blog post.

The Guest Customization settings are available as one of the ‘Section’ entries returned by accessing the ExtensionData properties on a CIVM object. Once connected (Connect-CIServer) you can see this from PowerCLI:

image

The ‘trick’ is that there are typically 5 sections (one each for OvfVSSD, OvfMsg, network connections, Guest Customization and VMware Tools). I’ve seen some approaches that rely on the Guest Customization section always being found at the Section[3] index in the ExtensionData collection, but this could easily change in future and break any functionality relying on it. A much more reliable way of finding the Guest Customization section values is:

image

But what if you need to change or update a setting? Luckily there is a method provided (UpdateServerData) which does exactly this. So if we want to (for example) change the ‘CustomizationScript’ setting to ‘echo "Hello World!"’ we can:

image
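For anyone who can’t make out the screenshots, here’s a rough sketch of the whole approach (the VM name is made up, and the section type name should be verified against your own PowerCLI version, e.g. with Get-Member):

# Get the VM and its vCloud API view
$vm = Get-CIVM -Name 'MyVM01'

# Find the Guest Customization section by type name rather than by index
$gc = $vm.ExtensionData.Section | Where-Object { $_.GetType().Name -eq 'GuestCustomizationSection' }

# Update the customization script and write the change back to vCloud Director
$gc.CustomizationScript = 'echo "Hello World!"'
$gc.UpdateServerData()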

You can change other settings using the same method (e.g. ComputerName or Domain join settings).

Note that for many changes the VM must be powered off, and you may need to ‘Power On and Force Recustomization’ too.

As always, comments & feedback appreciated.

Jon.

Getting detailed VM Disk Properties from the vCloud API

Since vCloud Director 8.10 VMware have allowed VMs to be created which have multiple disks using different storage policies. This can be very useful – for example, a database VM might have its database on fast storage but another disk containing backups or logs on slower/cheaper disk.

This can create issues when trying to find out what storage is in use for a VM, though: the PowerCLI Get-CIVM cmdlet (and the Get-CIView cmdlet used to get extra information) isn’t able to properly report storage for VMs that consume multiple storage policies. This in turn can create problems for Service Providers when they need to report on overall VM disk usage broken down by storage policy.

As an example, I’ve created a VM named ‘test01’ in a customer VDC which has 3 disks attached; the 2nd of these is on ‘Capacity’ tier storage while disks 1 and 3 are on ‘Performance’ storage. When we look at the VM details we see the following:

image

Digging into the ExtensionData shows

image

The StorageProfile element looks like it may contain what we need, but unfortunately this only shows the ‘home’ storage for the VM and doesn’t indicate that at least one of the VM’s disks is on a different storage profile:

image

After a lot of mucking around trying to find an easy way to discover the information, I ‘gave up’ and wrote a PowerShell module which accesses the vCD API directly to get the VM storage information (including storage tiers in use by each disk). The module isn’t overly efficient since it queries the storage profile reference for every disk on every VM (and so will result in a lot of calls if run for a large number of VMs), but otherwise works fine.

The module takes VM objects or a VM name as input and returns details on each disk attached to the VM including which storage profile they use. Save the script (e.g. as ‘Get-CIVMStorageProfile.psm1’) and then use ‘Import-Module .\Get-CIVMStorageProfile.psm1’ to import the function.
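Usage should then be as simple as something like the following (assuming the exported function name matches the file name):

# Import the module and query the disks of our test VM
Import-Module .\Get-CIVMStorageProfile.psm1
Get-CIVM -Name 'test01' | Get-CIVMStorageProfile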

And here is example output from the script for our test VM:

image

Hope this is useful to some of you and as always, appreciate any comments/feedback.

I’d also love to know if there’s an easier way of generating this information.

Jon.

Tenant Portal Displays ‘No Datacenters are available’ in vCloud Director 9.1

We had an issue recently when updating our vCloud Director environment to v9.1 where the new tenant portal would show ‘No Datacenters are available’ for every tenant even though the remainder of the site worked correctly (and other tabbed options like the Service Library & catalogs worked fine). Initially we suspected that our SSL certificate chain or public URIs were set incorrectly.

Adrian Begg has a great blog post here: http://www.pigeonnuggets.com/2018/03/vcloud-director-9-1-tenant-portal-displays-no-datacenters-available-after-upgrade/ which details this issue and how to ensure the correct settings are applied, however in our case this didn’t resolve our issue.

Eventually an offhand remark in a Slack channel by Tom Fojta put me on the right track to solving the issue, so I’ve written this post up in case anyone else comes across the same problem. If you’re impatient and want to know the solution – it’s DNS (isn’t it always DNS?), but that’s jumping ahead a bit.

In our environment we have 3 vCloud Director cell servers behind a load balancer. We also load-balance internally so that our management environment can talk to the vCD API and so we can test the environment without necessarily having it open to the public internet. The arrangement looks logically like this:


vCloud Director Load Balancing

Users from the internet accessing ‘portal.cloud.com’ get redirected to one of the vCD cell servers (and if one of them is unavailable the monitoring on the Load Balancer doesn’t direct requests there). The same happens for internal users, but in this case the ‘portal.cloud.com’ DNS entry has been overridden to point at the internal (192.168.0.10) address to allow connectivity to the cells even if the external LB or internet link is unavailable.

The issue in our environment was that the cell servers themselves use DNS to access the vCloud API – and they use the public URL specified in the vCloud Director configuration.

The cell servers were configured with our internal DNS servers, so when they attempted to access the public URL (‘portal.cloud.com’) they were given the internal Load Balancer address (192.168.0.10). For reasons we’re still exploring, this didn’t allow them to get a response from the vCD API and resulted in the ‘No Datacenters are available’ error in the tenant portal.

The fix turned out to be reasonably simple – on each cell server we added an entry to the /etc/hosts file to resolve the public URL to the cell’s own IP address, so on cell 01:

192.168.0.11    portal.cloud.com

On cell02:

192.168.0.12   portal.cloud.com

And on cell03:

192.168.0.13    portal.cloud.com

Once we’d made this change the tenant portals began functioning correctly (note that no restart of the cell servers or vCloud Director services was required).

What I assume is happening is that when the internal load balancer responds to the request it gives out a different cell server’s address (since the ‘source’ of the request will be a cell server) and that cell server has no knowledge of the session being used by the original cell, and so responds incorrectly (either with nothing, or with an error). Not sure if this is actually a bug, or just something to be aware of, but either way overriding name resolution in this way fixes the issue. Note that simply using ‘localhost’ or 127.0.0.1 for the hosts file entry doesn’t work since the vCloud web server doesn’t respond on the loopback interface in the default configuration.

Just posting this here in the hope it will save someone else the frustration caused by this issue.

Jon.

Using VMware Container Service Extension (CSE)

Yesterday I wrote a post showing the currently available container hosting options from VMware. As we’ve recently deployed one of these options (CSE) in our environment, I thought it would be useful to show a sample workflow of how the service functions and how customers can use it to deploy and manage both CSE clusters and the micro-service applications running on those clusters.

There are a few requirements on the tenant side which must be completed prior to any of this working:

  • An Organizational Administrator login to the vCloud platform where CSE is deployed.
  • Access to a virtual datacenter (VDC) with sufficient CPU, Memory and Storage resources for the cluster to be deployed into.
  • An Org VDC network which can be used by the cluster and has sufficient free IP addresses in a Static Pool to allocate to the cluster nodes (clusters take 1 IP address for the ‘master’ node and an additional address for each ‘worker’ node deployed).
  • A client prepared with Python v3 installed and the vcd-cli and container-service-extension packages installed on it.
  • The {$HOMEDIR}\.vcd-cli\profiles.yaml file edited to add the CSE extension to vcd-cli.
  • The kubectl utility installed to administer the Kubernetes cluster once deployed and working. kubectl can be obtained most easily from here.

Detailed instructions for the client setup can be found in the CSE documentation at https://vmware.github.io/container-service-extension/#tenant-installation. Note that on a Windows platform the .vcd-cli folder and profiles.yaml file will not be automatically created, but you can create the .vcd-cli folder manually from a DOS prompt and then use vcd-cli to log in and out of your cloud provider. This will cause profiles.yaml to be generated in the .vcd-cli folder. The profiles.yaml file can then be edited in your favourite text editor to add the required CSE extension lines.
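As a rough guide, the Windows client preparation looks something like this when run from PowerShell (the cloud URL, organization and user are examples only, packages are typically installed via pip, and ‘vcd login --help’ will confirm the exact login syntax for your vcd-cli version):

# Install the vcd-cli and CSE client packages (requires Python v3 and pip)
pip install vcd-cli container-service-extension

# Create the .vcd-cli folder manually if it doesn't already exist
New-Item -ItemType Directory -Path "$HOME\.vcd-cli" -Force

# Log in and out once so that profiles.yaml gets generated, then edit it
# to add the CSE extension lines as described in the documentation
vcd login cloud.provider.com MyOrg orgadmin --password 'MyP@ssw0rd'
vcd logout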

Deploying a Cluster with CSE

When deploying a cluster, you will need to know the storage profile and network names which the cluster will use; the easiest way of obtaining these is either from the vCloud portal or by using the vcd vdc info command when logged in to your environment:

image

If you have multiple VDCs available to you, use the ‘vcd vdc use <VDC Name>’ command to set which one to work with.

In this example we will be using the highlighted entries (the ‘Tyrell-Servers’ network and the ‘CHC Performance’ storage profile).

To retrieve a list of available cluster deployment templates that the Service Provider has made available to us we can use the vcd cse template list command:

image

In this example only the Photon OS template is available, and it is also the default template. CSE actually comes with 2 templates (Photon OS v2 and Ubuntu Linux 16.04), but I’ve only installed the Photon OS v2 template in my lab environment. The default template will be used if you do not specify the ‘--template’ switch when creating a cluster.

The cluster create command takes a number of parameters which are documented on the CSE page:

image

Be careful with the memory specification as it is in MB and not GB.

I chose to generate a public/private key pair to access the cluster nodes without needing a password, but this is optional. If you want to use key authentication you will need to generate a key pair and specify the public key filename in the cluster creation command using the --ssh-key switch.

To deploy a cluster with 3 worker nodes into our VDC where each node has 4GB of RAM and 2 CPUs using my public key and the network and storage profile identified above:

image
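As a rough guide, the cluster creation command looks something like the line below (exact switch names may differ between CSE releases, so check ‘vcd cse cluster create --help’ in your own environment):

# 3 worker nodes, 2 vCPUs and 4GB (4096MB) RAM per node, using the network
# and storage profile identified earlier, plus my public SSH key
vcd cse cluster create myCluster --network 'Tyrell-Servers' --nodes 3 --cpu 2 --memory 4096 --storage-profile 'CHC Performance' --ssh-key $HOME\.ssh\id_rsa.pub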

The deployment process will take several minutes to complete as the cluster VMs are deployed and started.

In the vCloud Director portal we can see the new vApp that has been deployed with our master and worker nodes inside it; we can also see that all 4 VMs are connected to the network we specified:

image

To see the details of the nodes deployed we can use ‘vcd cse node list <cluster name>’:

image

To manage the cluster with kubectl, we need a configuration file for Kubernetes containing our authentication certificates. kubectl by default looks for a file named ‘config’ in a folder called ‘.kube’ under the current user’s home directory. The config file itself can be downloaded using CSE. To create the folder and write the config file:

image
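From PowerShell the steps are roughly as follows (the ‘cluster config’ sub-command shown is an assumption based on the CSE release I used, so check ‘vcd cse cluster --help’ if yours differs):

# Create the .kube folder in the user profile if it doesn't already exist
New-Item -ItemType Directory -Path "$HOME\.kube" -Force

# Write the cluster's kubeconfig out to the default location kubectl uses
vcd cse cluster config myCluster | Out-File "$HOME\.kube\config" -Encoding ascii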

If you have multiple deployed clusters you can create separate config files for each one (with different file names) and use the --kubeconfig= switch to kubectl to select which one to use.

To test kubectl we can ask for a list of all containers (‘pods’ in Kubernetes) in the cluster; the ‘--all-namespaces’ switch shows system pods as well as any user-created pods (which we don’t have yet). This must be run from a machine that has network connectivity with the deployed nodes (the ‘Tyrell-Servers’ network in this example):

image
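In text form, that check is simply:

kubectl get pods --all-namespaces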


Cluster Scaling

Adding Nodes to Clusters

If we need to add worker nodes to a cluster this is accomplished with the ‘vcd cse node create’ command. For example, we can add a 4th worker node to our ‘myCluster’ cluster as follows:

image
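The command is of this form (again, the switch names are assumptions from the CSE release in my lab, so verify them with ‘vcd cse node create --help’):

# Add one additional worker node to the existing cluster
vcd cse node create myCluster --nodes 1 --network 'Tyrell-Servers' --cpu 2 --memory 4096 --storage-profile 'CHC Performance'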

The node list now shows our cluster with 4 worker nodes including our new one:

image

Removing Nodes from Clusters

Removing a cluster member is just as easy, using the ‘vcd cse node delete’ command:

image

You will be prompted to confirm the node deletion. If you have deployed container applications, you should ensure that the node is properly drained and/or that replica sets and deployments are configured correctly so that the node deletion will not impact your applications.
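Draining a node first is a standard kubectl operation, so something like the following, run before the ‘vcd cse node delete’, is a sensible precaution (replace the placeholder with a node name shown by ‘vcd cse node list’):

# Safely evict pods from the node before removing it from the cluster
kubectl drain <worker node name> --ignore-daemonsets --delete-local-data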


Cluster Host Affinity

One item that CSE does not deal with yet is creating vCloud anti-affinity rules to ensure that your worker nodes are spread across different physical hosts. This means that, with appropriately configured applications, a host failure will not impact the availability of your deployed services. It is reasonably straightforward to add anti-affinity rules in the vCloud portal though.

Our test cluster is back to 3 nodes following the deletion example:

image

In the vCloud portal we can go to ‘Administration’ and select our virtual datacenter in the left pane; we will then see an ‘Affinity Rules’ tab:

image

Clicking the ‘+’ icon under Anti-Affinity Rules allows us to create a new rule to keep our worker nodes on separate hosts:

image

Provided the VDC has sufficient backing physical hosts, the screen will update to show the new rule and that it has successfully been applied, separating the worker nodes onto different hosts:

image

Of course, if the host running the master node experiences a failure then the master will be unavailable until the VMware platform restarts the VM on another host.


Application Deployment using kubectl

Of course now that our cluster is up and running, it would be nice to actually deploy a workload to it. The ‘sock shop’ example mentioned in the CSE documentation is a good example application to try as it consists of several pods running in a separate namespace.

First we use kubectl to create the namespace:

image
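In text form this is just:

kubectl create namespace sock-shop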

Now we can deploy the application into our namespace from the microservices-demo project on GitHub. You can read more about the sock-shop demo app at https://github.com/microservices-demo/microservices-demo.

C:\Users\jon>kubectl apply -n sock-shop -f "https://github.com/microservices-demo/microservices-demo/blob/master/deploy/kubernetes/complete-demo.yaml?raw=true"
deployment "carts-db" created
service "carts-db" created
deployment "carts" created
service "carts" created
deployment "catalogue-db" created
service "catalogue-db" created
deployment "catalogue" created
service "catalogue" created
deployment "front-end" created
service "front-end" created
deployment "orders-db" created
service "orders-db" created
deployment "orders" created
service "orders" created
deployment "payment" created
service "payment" created
deployment "queue-master" created
service "queue-master" created
deployment "rabbitmq" created
service "rabbitmq" created
deployment "shipping" created
service "shipping" created
deployment "user-db" created
service "user-db" created
deployment "user" created
service "user" created

We can see deployment status by getting the pod status in our namespace:

image

After a short while all the pods should have been created and show a status of ‘Running’:

image

The ‘sock-shop’ demo creates a service which listens on port 30001 on all nodes (including the master node) for http traffic, so we can get our master node IP address from ‘vcd cse node list myCluster’ and open this page in a browser:

image
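If you want to confirm the NodePort mapping for yourself, you can also query the front-end service (the service name comes from the manifest we applied above):

kubectl get service front-end -n sock-shop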

And here’s our deployed application running!

image

Summary / Further Reading

Of course there’s much more that can be done with Docker and Kubernetes, but hopefully I’ve been able to demonstrate how easily a cluster can be deployed using CSE and how micro-services applications can be run in this platform.

For further reading on kubectl and all the available functionality I can recommend the Kubernetes kubectl documentation at https://kubernetes.io/docs/reference/kubectl/overview/. In fact the entire Kubernetes site is well worth a read for those considering deployment of these architectures.

As always, comments, feedback, suggestions and corrections always welcome.

Jon.

VMware Container Solutions

VMware appears to have gone a little ‘mad’ with regards to containerisation (or containerization for any American readers) lately. Last week saw the release of Pivotal Container Service (PKS), which was launched at VMworld 2017 US back in August. With this there are now a total of three VMware technologies all enabling customers to run micro-service applications in their environments. So why three different products to do the same thing? Well, they are targeted at different environments and use-cases, and actually it makes a lot of sense for VMware to have solutions for all 3 scenarios. Of course there’s always the 4th option of building your own container hosting platform from scratch on a VMware platform, but let’s concentrate for now on those provided by VMware.

So what are the available solutions?

Pivotal Container Service (PKS)

This was announced at VMworld 2017 and recently became available for download. PKS is a full-stack solution to manage both the initial formation of clusters to support containerised applications and their ‘day 2’ operations once deployed. While PKS could be deployed in an Enterprise environment (and may be, for organisations using containerised applications at significant scale), it appears to be more targeted towards cloud service providers wishing to offer a managed/hosted platform for multiple tenants.

vSphere Integrated Containers (VIC)

VIC has been around for a while now (it was based on VMware’s Project Bonneville, which started back in 2015). VIC has recently been updated to v1.3.1, having gained the capability to use Docker hosts natively at version 1.2 (prior to this, VMware Host Containers had to be used). VIC supports vSphere version 6.0 and upwards and is primarily targeted at Enterprise customers wishing to provide a managed container hosting environment within their own infrastructure.

Container Service Extension (CSE) for vCloud Director

Sitting somewhat in between the other offerings, VMware has also released CSE via an open source GitHub repository. CSE is targeted at Service Providers using VMware’s vCloud Director platform who wish to make delivering container hosting to tenants much easier. It provides an extension to vCloud Director which allows the creation and maintenance of clusters of VMs running Docker and Kubernetes.

Comparing the solutions

The table below shows a summary of the options:


| Solution | Pivotal Container Service (PKS) | vSphere Integrated Containers (VIC) | Container Service Extension (CSE) |
| --- | --- | --- | --- |
| Current release / link | v1.0.0 GA | v1.3.1 | v0.4.2 |
| Container Runtime | Docker | Docker & Virtual Container Host (VCH) | Docker[1] |
| Container Management | Kubernetes | VMware Admiral | Kubernetes[1] |
| Container OS | BOSH | Virtual Container Host (Photon OS based) | Any (Ubuntu & Photon provided) |
| Container Registry | VMware Harbor | VMware Harbor | Any (None provided) |
| Deployed to | Bare metal / VM | Bare metal / vSphere VM | vCloud Director Virtual Datacenter (VDC) |
| Multi-tenant Supported | Yes | Yes | Yes |
| Network Support | VMware NSX-T | vSphere & VMware NSX-V | Org VDC Networks (vCloud Director) / VMware NSX-V |
| Licensing / Support | Open Source, Paid Support available from Pivotal | Open Source, vSphere S&S Support covers VIC | Open Source, Service Provider Support |
| Primarily Targeted At | Service Providers & Enterprise using containers at scale | Enterprise | Service Provider / vCloud Tenants |

[1] CSE allows service providers to provide any versions of Docker and Kubernetes in their templates. This can allow much more up-to-date versions than those supported in PKS or VIC.

CSE deployment for a Service Provider

I’ve recently been involved with deploying CSE to our own vCloud Director hosting platform. The VMware github.io page is extremely useful and well documented for getting up and running with CSE, so I won’t repeat that here.

The main advantages it offered us as a service provider:

No new billing / Integration required
This is a huge deal for most service providers. It can be time-consuming (and therefore expensive) to integrate any new platform offering, not just in the time taken to deploy the components and get them all working correctly (including alerting, monitoring etc.), but also in the often-overlooked effort required to correctly meter platform consumption and ensure that customer bills are correctly prepared and reflect the resources their environments have consumed. Taking the ‘full stack’ of PKS and offering this as a service would involve considerable work, but with CSE this workload is effectively neutralised, since clusters are deployed directly into tenant virtual datacenters (VDCs) and service providers will already be metering and billing customers for resources consumed in tenant VDCs.

No new licensing
As there is no additional licensing for CSE, it is extremely easy to deploy in a service provider platform.

No new security model
Since all tenant interaction with CSE is via the vCloud Director API, there is very little work required (if any) to publish the service to customers, as most Service Providers will already be making the vCD API accessible to their tenants. Additionally, because the CSE service itself integrates directly into vCloud Director’s RabbitMQ backend, it is likely that very few security or firewall changes are required either.

Flexible environment
One of the really nice aspects of CSE is that the templates made available to tenants to deploy into clusters are fully customisable. This means that service providers can choose to offer additional templates beyond the 2 examples provided ‘out of the box’ with CSE. For example, if a Service Provider wishes to offer a ‘bleeding edge’ template which has the absolute latest releases of Docker and Kubernetes (and maybe add additional packages to the deployed images to include Harbor and perhaps a ceph or glusterfs client) this is reasonably straightforward and easy to do. The downside is that maintaining and updating these templates has to be done regularly to ensure that they include all appropriate bug-fixes, security patches and updates.

Note that at this time the VMware documentation doesn’t yet include instructions for modifying or adding additional CSE templates; I’ll write up a separate post on how I did this in our environment, which may prove useful for others deploying CSE into their own environments.

Other CSE Considerations

Of course no platform is ever perfect, and the following should be noted too as potential pitfalls or things to be aware of when considering CSE:

No registry service by default
Both PKS and VIC provide container registry services (to deal with storing, securing, scanning and replicating container images) based on VMware’s Project Harbor which is a very nice registry system. While Harbor can be added to clusters deployed with CSE, it isn’t there by default in the templates currently provided with CSE.

No persistent or dynamic Kubernetes volumes
Containers by design are meant to be ephemeral and stateless, so they shouldn’t be storing any persistent data or require backup protection. Of course most business applications (including those provided by containerised images) generally need some form of permanent/persistent storage behind them. In Kubernetes environments this is generally accomplished by the concept of persistent volumes which are mapped into containers at runtime and allow data to be retained. In CSE currently there is no provider for persistent volumes which means that external storage is required. This can however be delivered from a variety of sources – other databases running in the environment, file or object storage services etc. I’m currently looking into easy ways to add dynamically provisioned persistent volumes to a CSE cluster and will write this up as a separate post when done.

Template maintenance
As mentioned previously, the templates deployed by CSE are completely flexible and can be easily customised by editing their deployment scripts, the process of maintaining the templates is reasonably manual though and requires stopping the CSE service, patching and updating the templates in a vCloud Director shared catalog and then re-enabling the CSE service. It would be nice to have a way to automate the rebuild of templates and to allow the CSE services to remain online while this is happening.

Relative immaturity
The CSE service is a very ‘early’ release and a number of bugs are still being fixed. There’s nothing too serious that I’ve encountered yet, but occasionally templates will fail to build correctly (generally due to failures in 3rd-party repositories) and it can take time to identify and resolve these issues. Fortunately the VMware developers have been extremely fast and active in responding to issues raised in the CSE GitHub repository, and every issue I’ve found has been very quickly fixed.

Summary

Hopefully this post has given you an idea of the capabilities and features available in the 3 current VMware container hosting solutions and a better idea of what the Container Service Extension (CSE) for vCloud Director does. I’m aiming to write some follow-up posts on CSE, including how we have deployed it into our environment, how new templates can be created (and existing templates customised) and how to address some of the current missing features such as integrating Harbor as a registry service. Let me know in the comments if there are any areas you are particularly interested in and I’ll see what I can do. I’ve also written a session abstract proposal to present a Service Provider view of CSE at VMworld US 2018, so I’m hoping that will be accepted too.

References / Links

Some of the components mentioned may not be familiar so I’ve provided links to each one below:

BOSH: https://bosh.io/
Docker: https://www.docker.com/
Harbor: https://vmware.github.io/harbor/
Kubernetes: https://kubernetes.io/
Ubuntu Linux: https://www.ubuntu.com/
VMware Photon OS: https://vmware.github.io/photon/

As always, comments & corrections welcome; I’m reasonably new to the whole ‘containerised applications’ scene so there may well be inaccuracies in this post(!)

Jon.

vCloud Director Extender – Part 5 – Stretch Networking (L2VPN)

In this 5th part of my look into vCloud Director Extender (CX), I deal with the extension of a customer vCenter network into a cloud provider network using the L2VPN network extension functionality. Apologies that this post has been a bit delayed; it turned out that I needed a VMware support request and a code update to vCloud Director 9.0.0.1 before I could get this functionality working. (I also had an issue with my lab environment, which runs as a nested platform inside a vCloud Director environment; it turned out that the networking setup I had wasn’t quite flexible enough to get this working.)

Update: an earlier version of this article didn’t include the steps to configure the L2 appliance settings in the vCloud Director Extender web interface – I’ve now added these to provide a more complete guide.

Links to the other parts of this series:
Part 1 – Overview
Part 2 – Cloud Provider / Service Provider installation and configuration (MyCloud)
Part 3 – Customer / Tenant installation and configuration (Tyrell)
Part 4 – Customer / Tenant connecting to a Cloud Provider and Virtual Machine migration (Tyrell)

I won’t deal with the use-case here that the customer already has NSX networking installed and configured, since in most cases you can simply create L2VPN networks directly between the customer and provider NSX Edge appliances and don’t really need to use the CX L2VPN functionality.

In order to be able to use the standalone L2VPN connectivity, the following pre-requisites are required:

  • A tenant vSphere environment with the vCloud Director Extender appliance deployed (it does not appear to be necessary to deploy the replication appliance if you only wish to use the L2VPN functionality, but obviously if you are intending to migrate VMs too you will need this deployed and configured as described in Part 3 of this series). In either case you will still need to register the cloud provider in the CX interface.
  • A configured vCloud Director VDC for the tenant to connect to. This environment must also have an Advanced Edge Gateway deployed with at least one uplink having a publicly accessible (internet) IP address. Note that you do not need to configure the L2VPN service on this gateway – the CX wizard completes this for you.
  • At least one OrgVDC network created as a subinterface on this edge gateway. The steps to create a suitable new OrgVDC network are detailed below.
  • Outbound internet connectivity to allow the standalone edge deployed in the tenant vCenter to communicate with the cloud-hosted edge gateway – only port 443/tcp is required for this.
  • Administrative credentials to connect to both the tenant vCenter and the cloud tenancy/VDC (Organization Administrator role is required).

Opening the tenant vCenter environment and selecting the ‘Home’ page shows the following:

Selecting the vCloud Director Extender icon opens the CX interface:

If you have not yet configured the L2 appliance settings, selecting the ‘DC Extensions’ tab will show the following error:

To fix this, open the vCloud Director Extender web interface in a browser at https://<ip address of deployed cx appliance>/, log in and select the ‘DC Extensions’ tab:

Select the ‘Add Appliance Configuration’ option and complete the form to provide the deployment parameters where the standalone NSX edge appliance will be deployed:

The ‘Uplink Network Pool IP’ setting is a bit strange – it appears to be asking for a network pool or IP range, but the ‘help text’ in the field is asking for a single IP address. I found that the validation on this field is a bit odd – it will basically accept any input at all (even random strings) without complaining, but obviously deployment won’t work. What you need to do is add individual IPv4 addresses and click the ‘Add’ button for each. You will need 1 address for each stretched network you will be extending to your cloud platform. In this example I am only extending a single network so have added a single IPv4 address (192.168.0.201).

Once you click the ‘Create’ button you will be returned to the ‘DC Extensions’ tab and shown a summary of the L2 appliance configuration:

Note that there doesn’t appear to be any way to edit an existing L2 Appliance configuration, so if you need to change settings (e.g. to add additional uplink IP pool addresses) you will likely need to delete and recreate the entire entry.

 

Next we need to add a new ‘subinterface’ network to our hosted Edge gateway appliance. Logging in to our cloud provider portal, we can select the ‘Administration’ tab and the ‘Org VDC Networks’ sub-option; clicking the ‘Add’ button shows the dialog to create a new Org VDC Network. We need to select ‘Create a routed network by connecting to an existing edge gateway’ and then check the ‘Create as subinterface’ check box:

Next we configure the standard network information (Gateway, Network mask, DNS etc.). Since this network will be bridged to our on-premises network we can use the same details. Optionally a new Static IP pool can also be created so that new VMs provisioned in the cloud service can use this pool for their IP addresses. This won’t be an issue for VMs being migrated as they will carry across whatever IP addresses are already assigned to them. Note that the gateway address is set to be the same address as the existing (on-premises) gateway – this means that re-configuring the default gateway setting in the guest OS isn’t required either:

Now we supply a name for the new Org VDC network and optionally a description. The check box can also be used if the customer has multiple VDCs and wishes to share the new network across them:

Finally the summary screen allows us to check the information provided and go back and make any changes required if not correct. The most important setting is to make sure the network is attached to the edge gateway as a subinterface:

Once finished creating, the Org VDC network will be shown in the list with a type of ‘Routed’ and an interface type of ‘Subinterface’:

Next we access the vCloud Extender interface from within the customer vCenter plugin, selecting the ‘DC Extensions’ tab takes us to the following dialog:

Selecting ‘New Extension’ shows the dialog to create a new L2 extension; the fields are mostly populated for you. The ‘Enable egress’ option allows you to select which gateway(s) will be allowed to forward traffic outside of the extended network. In this example I’ve only configured egress on the Source (on-premises) side through the existing gateway:

When you click ‘Start’, the status will go to ‘Connecting’ and a number of activities will take place in the customer vCenter:

Reading from the bottom (oldest) upwards, a new port group is created, an NSX Edge Standalone appliance is deployed and powered-on and the new port group is reconfigured once this has completed (ignore the VM migration task, that just happened to occur during the same time window in my lab). In this case the new NSX standalone edge was named ‘mcloudext-edge-4’ and the port group ‘mcxt-tpg-l2vpn-vlan-Tyrell-VDC15’.

Once deployment has completed (takes a few minutes) the vCloud Extender client interface shows the new DC extension network with a status of ‘Connected’:

In the tenant vCloud Director portal you can also see the status of the tunnel under ‘Statistics’ and ‘L2 VPN’ within the edge gateway interface:

You will now find that any VMs connected to the stretched network (OrgVDC network) in your cloud environment have L2 connectivity with the on-premises network and will continue to function as if they were still located in the customer’s own datacenter.

As I mentioned at the start of this post, I hit a number of issues when configuring this environment, and getting it working took several attempts and a couple of rebuilds of my lab. The main problem was that the initial release of vCloud Director v9.0.0.0 has an issue that prevents the details required for the standalone NSX edge deployment from being returned by the API. This prevents the deployment of the customer edge at all and resulted in my VMware support call. The specific issue is referenced in the vCloud Director 9.0.0.1 release notes as ‘Resolves an issue where the vCloud Director API does not return a tunnelID parameter in response to a GET /vdcnetworks request sent against a routed Organization VCD network that has a subinterface enabled.’ As far as I can work out, it will be impossible to successfully use L2VPN in CX without upgrading the provider to vCloud Director 9.0.0.1 to resolve this.

The other issue I hit in my lab was that my hosted ‘Tenant Edge’ was NAT’d behind another NSX Edge gateway which was also performing NAT translation (double-NAT). This was due to the way my lab is built in a nested environment inside vCloud Director. Unfortunately this meant the external interface of my hosted ‘Tenant Edge’ was actually an internal network address, so when the customer/on-premises edge tried to establish contact it was using an internal network address, which obviously wasn’t going to work. I solved this by connecting a ‘real’ external internet network to my hosted Tenant Edge.

As always, comments and feedback always appreciated.

Jon.

vCloud Director Extender – Network Ports

One of the things which appears to be missing from the published documentation on vCloud Director Extender (CX) is any mention of the internal communications between the deployed appliances and other VMware infrastructure components (vCenter, vCloud Director etc.). In a service provider context it is unlikely that the appliances will be deployed into the same network/security zone as these components, so it is important to know what these communication requirements are.

Using the Flow Monitoring functionality in VMware NSX I was able to capture all traffic flows during vCloud Extender migrations and produce the drawing below detailing these traffic flows.

Network Traffic Flows for vCloud Extender (Provider Side)


Note that the http (tcp/80) access from the replicator appliance to the ESXi hosts appears anomalous – I would have expected this to be on https (tcp/443) at the very least and this probably needs further investigation.

The 8044/tcp port to the replication manager can be NAT’d from a different external (public) port if necessary – this can be configured using the “Public Endpoint URL” field when activating the replication manager appliance during vCloud Extender deployment (see my post: http://kiwicloud.ninja/2017/10/vcloud-director-extender-part-2-cloud-provider-setup/).

The 44045/tcp port to the replicator appliance can also be NAT’d from a different external (public) port if necessary – this can be configured using the “Public Endpoint URL” field when activating the replicator appliance during vCloud Extender deployment (see my post: http://kiwicloud.ninja/2017/10/vcloud-director-extender-part-2-cloud-provider-setup/).

Be careful when activating the “Replication Manager” and “Replicator” appliances – the configuration screens look very similar and it is reasonably easy to get them mixed up and enter incorrect parameters.

Also note that this diagram only depicts traffic flows for migration activity and doesn’t capture additional flows involved in L2 network extensions (which typically will be from a hosted NSX edge to either the tenant NSX edge or standalone NSX appliance in the tenant site).

At least the information presented should allow other service providers to configure appropriate network security to protect their internal vCloud and vSphere environments when deploying vCloud Extender components into a DMZ network (for example).

As always, comments and feedback appreciated.

Jon