Tuesday, 27 October 2020

HCX Central CLI Mind Map

In a previous article I introduced the HCX Central CLI (CCLI) as a tool to help troubleshoot HCX connectivity and performance issues. I've created a quick Mind Map of all the various commands that are available through CCLI to ease the troubleshooting process.

You can download a .pdf version of the full Mind Map here

This is correct as of HCX Version R144 (Build 16989452)

Monday, 26 October 2020

Troubleshooting HCX Connectivity and Performance Issues into VMware Cloud on AWS

When working with customers on VMware Cloud on AWS POCs or pilots, many of the success criteria typically include using Hybrid Cloud Extension (HCX) to migrate workloads from on-premises into VMware Cloud on AWS, using either bulk migration or live vMotion. This sometimes involves troubleshooting connectivity and performance issues, so I figured I would show the typical process that I follow to narrow down and identify the issue.

HCX Service Mesh Appliances Tunnel Status

Wednesday, 16 September 2020

VMC Sizer now accepts RVTools and LiveOptics Inputs

When it comes to sizing environments for customers who are looking to migrate from on-premises into VMware Cloud on AWS, we always use the VMC Sizer website. The tool allows us to input and modify various parameters to correctly size the environment. The basic values that we need to gather from customers are:

Number of VMs

Storage per VM

vCPU per VM

vRAM per VM
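
To show how these four inputs combine into aggregate demand, here is a minimal Python sketch. It is not how VMC Sizer works internally; the class name, function name and sample numbers are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorkloadProfile:
    """The four basic sizing inputs, expressed as per-VM averages."""
    vm_count: int
    storage_gb_per_vm: float
    vcpu_per_vm: float
    vram_gb_per_vm: float


def aggregate_demand(profile: WorkloadProfile) -> dict:
    """Multiply the per-VM averages by the VM count to get raw totals."""
    return {
        "total_storage_gb": profile.vm_count * profile.storage_gb_per_vm,
        "total_vcpu": profile.vm_count * profile.vcpu_per_vm,
        "total_vram_gb": profile.vm_count * profile.vram_gb_per_vm,
    }


# Hypothetical environment: 200 VMs averaging 100 GB, 2 vCPU and 8 GB vRAM each.
demand = aggregate_demand(WorkloadProfile(200, 100, 2, 8))
print(demand)
```

These raw totals are only the starting point; a real sizing exercise also has to account for overheads and policies that this sketch deliberately ignores.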

Thursday, 23 January 2020

North East VMUG - Thursday 6th February 2020

The first North East VMUG of 2020 is ready for registration and boy is it going to be a good one. The event will take place on Thursday 6th February at the Royal Station Hotel, which is situated right next to Newcastle Central Station. Once again, the team have put on a spectacular agenda with some great VMware speakers as well as community sessions. Special thanks to all the sponsors who fund these events. Without them these events would not happen, so during the breaks go and spend some time with them and see what they have to offer:


The current agenda is as follows:

VMware Keynote - Frank Denneman (Blog | Twitter) - Chief Technologist at VMware
Title - VMware’s Hybrid Cloud Vision and Strategy

In this session, Frank will discuss VMware’s 3-year Hybrid Cloud Vision and Strategy. He’ll discuss industry trends driving our thinking and then lay out our vision for how we can evolve our core platform to support where customers and the industry are going. re provides a lot of value, especially in large-scale Kubernetes deployments.

VMware Keynote - Duncan Epping (Blog | Twitter) - Chief Technologist at VMware 
Title - How HCI is revolutionizing the datacenter today and tomorrow! 

A year has passed and a lot has changed since then. Not just for you, but also for HCI and vSAN in particular. In this session, Duncan will discuss where we are coming from, but more importantly where we are going. Be warned, this session will include forward-looking statements and demos of to-be-released features.

Community Session - Kyle Jenner (Blog | Twitter) - AWS Architect at Rackspace
Title - VI admin to Cloud Architect

A little northerner's journey to the cloud - tips and tricks learned along the way and thoughts on the future.

Community Session - Craig Dalrymple (Blog | Twitter) - Senior Systems Engineer at Brightsolid
Title - ...And now for something completely different

A frank and open presentation about mental health in IT, it is real and it really can happen to anyone ...did I mention there will be Irn Bru?

Community Session - Michael Sweeney (Twitter) - Virtualisation Specialist at DXC.Technology
Title - Modern Day Infrastructure Management using vRealize Ops

Overview and lessons learnt from a vROps deployment and what advantages this brought to make more informed business decisions.

VMware Session - Andy Watson (Twitter) - Systems Engineer at VMware
Title - Virtual Cloud Networking

Andy Watson, a Solution Engineer in VMware’s Network and Security Business Unit, gives an update on the Virtual Cloud Network. In the last 3 years, VMware has gone from a single networking product to providing complete consistent networking and security from the Datacentre to the Edge and the Cloud.

This is a great event to network with peers and experts from VMware as well as have a few sneaky beers since it's no longer dry January :)

See you there


Tuesday, 21 January 2020

Protecting a 3-Tier App in VMC with the NSX Distributed Firewall

VMware Cloud on AWS SDDC v1.9 was officially released on the 16th January 2020 and includes a wealth of enhancements with regards to the NSX Distributed Firewall and Inventory Groups. The release notes for SDDC v1.9 can be found below if you want to check out all the new features that are available with this release:


In this article, I am going to show you how to protect a simple 3-Tier Application using the NSX Distributed Firewall and some of the enhancements around Inventory Groups. A policy-based VPN has already been established from on-premises into VMC and the application has already been deployed into a single network segment as per the diagram below:

A policy-based VPN has been configured to allow the following traffic:

On-Premises to VMC - -

VMC to On-Premises - -

The on-premises firewall has been configured to allow all traffic between the network segments and an ANY - ANY - Allow rule has been configured on the Compute Gateway Firewall to allow all traffic into the 3-Tier App Segment. I could lock this down but I want to focus on the Distributed Firewall security aspects for this particular blog post:

Prior to configuring the distributed firewall, I check the application is working as expected by accessing the webserver from the on-premises management segment ( to verify connectivity:

Since this is the only application that will currently be running in VMC and I want to ensure maximum security, I switch from a Blacklist security approach to Whitelist with logging. This is a new feature of VMC and sets the default ANY - ANY rule to deny with logging rather than allow. When setting the security approach you have the following options:

Blacklist - This option creates a default ANY-ANY rule to allow all traffic. This is the default option for the distributed firewall.
Blacklist with logging - This option creates a default ANY-ANY rule to allow all traffic with logging enabled.
Whitelist - This option creates a default ANY-ANY rule to block all traffic. All communication is denied access including DHCP traffic.
Whitelist with logging - This option creates a default ANY-ANY rule to block all traffic with logging enabled. All communication is denied access including DHCP traffic.
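
The four options boil down to two settings on the default ANY-ANY rule: its action and whether logging is enabled. A minimal Python sketch of that mapping follows; the `DefaultRule` type and the dictionary are illustrative names of my own, not anything exposed by VMC or NSX.

```python
from typing import NamedTuple


class DefaultRule(NamedTuple):
    action: str    # what the default ANY-ANY rule does with unmatched traffic
    logging: bool  # whether hits on the default rule are logged


# The four options as described above; the keys are just labels for this sketch.
SECURITY_APPROACHES = {
    "Blacklist": DefaultRule(action="allow", logging=False),
    "Blacklist with logging": DefaultRule(action="allow", logging=True),
    "Whitelist": DefaultRule(action="block", logging=False),
    "Whitelist with logging": DefaultRule(action="block", logging=True),
}

chosen = SECURITY_APPROACHES["Whitelist with logging"]
print(chosen.action, chosen.logging)
```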

The option can be changed in the Cloud Services Portal (CSP) under Networking & Security then Distributed Firewall:

I verify connectivity has been dropped by pinging my WEB, APP and DB servers as well as trying to access the application via

The Distributed Firewall rules that I want to implement are as follows:

I only want to allow SSH and ICMP from my on-premises management segment to the 3-Tier App for support/troubleshooting purposes. I then want to allow anything to talk to my web server, but only over HTTP (TCP port 80). My web server needs to talk to my app server, also over HTTP, and finally the application server communicates with the database server using the MySQL service (TCP port 3306). Since we have a whitelist security approach, all other communication will be blocked, thus preventing any lateral movement by a malicious hacker. An example of this would be a hacker compromising my web server and trying to move laterally, say over a known MySQL exploit, into the database server to extract data.
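
To make the intent of this ruleset concrete, here is a small Python model of it. This is a sketch that assumes simple top-to-bottom, first-match evaluation with a final default deny; it is not NSX code, and the endpoint names (`admin-desktop`, `WEB01`, `APP01`, `DB01`) and service labels are placeholders chosen for the example.

```python
from dataclasses import dataclass

ANY = "ANY"


@dataclass(frozen=True)
class Rule:
    name: str
    source: str          # group name, or ANY
    destination: str     # group name, or ANY
    services: frozenset  # service names, or {ANY}
    action: str


# Which groups each endpoint belongs to (hypothetical endpoint names).
MEMBERSHIP = {
    "admin-desktop": {"On-Premises Management"},
    "WEB01": {"Web Servers", "3-Tier App"},
    "APP01": {"App Servers", "3-Tier App"},
    "DB01": {"DB Servers", "3-Tier App"},
}

RULES = [
    Rule("Admin access", "On-Premises Management", "3-Tier App",
         frozenset({"SSH", "ICMP"}), "allow"),
    Rule("Any to Web", ANY, "Web Servers", frozenset({"HTTP"}), "allow"),
    Rule("Web to App", "Web Servers", "App Servers", frozenset({"HTTP"}), "allow"),
    Rule("App to DB", "App Servers", "DB Servers", frozenset({"MySQL"}), "allow"),
    # Whitelist approach: anything not matched above is denied.
    Rule("Default", ANY, ANY, frozenset({ANY}), "deny"),
]


def in_group(endpoint: str, group: str) -> bool:
    """True if the group is ANY or the endpoint is a member of the group."""
    return group == ANY or group in MEMBERSHIP.get(endpoint, set())


def evaluate(source: str, destination: str, service: str) -> str:
    """Return the action of the first rule that matches, top to bottom."""
    for rule in RULES:
        if (in_group(source, rule.source)
                and in_group(destination, rule.destination)
                and (ANY in rule.services or service in rule.services)):
            return rule.action
    return "deny"


print(evaluate("admin-desktop", "DB01", "SSH"))  # admin access rule
print(evaluate("APP01", "DB01", "MySQL"))        # expected application flow
print(evaluate("WEB01", "DB01", "MySQL"))        # lateral move, hits the default
```

The last call is the lateral-movement case described above: the web server is not in the App Servers group, so its MySQL attempt falls through to the default rule.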

I’m going to use NSX Inventory Groups as the basis for my DFW ruleset and dynamically add members to the groups based on tags, which is new with v1.9 of the SDDC. Using tags has the added benefit that if I need to scale out any portion of my application (for example, add more web servers) then I can simply deploy the server, assign a tag to the server and automatically ensure the correct security posture is applied to the workload without having to modify any firewall rules. I’m going to start by creating the following Inventory Groups with the following membership criteria:

Web Servers - Dynamically add workloads to the group when a Web tag is applied to the VM
App Servers - Dynamically add workloads to the group when an App tag is applied to the VM
DB Servers - Dynamically add workloads to the group when a DB tag is applied to the VM
3-Tier App - This group will have static membership and I will add the Web Servers, App Servers and DB Servers group into it. Nested groups are also a new feature released as part of SDDC v1.9.
On-Premises Management - This group will have static membership which will include the CIDR range of the on-premises management segment (
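
The three kinds of membership described above (tag-driven, nested and CIDR-based) can be sketched in a few lines of Python. This is a conceptual model only, not the NSX API. The VM names are made up, and the CIDR is a placeholder from the documentation range because the real range is not shown in this post.

```python
import ipaddress

# Hypothetical inventory: VM name -> set of tags applied to it.
VM_TAGS = {
    "WEB01": {"Web"},
    "APP01": {"App"},
    "DB01": {"DB"},
    "JUMP01": set(),
}

# Dynamic groups, each driven by a single tag.
TAG_GROUPS = {"Web Servers": "Web", "App Servers": "App", "DB Servers": "DB"}

# Nested group: its members are other groups.
NESTED_GROUPS = {"3-Tier App": ["Web Servers", "App Servers", "DB Servers"]}

# Placeholder CIDR (assumption); substitute the real management range.
ON_PREM_MANAGEMENT = ipaddress.ip_network("192.0.2.0/24")


def members(group: str) -> set:
    """Resolve a group to VM names, following nested groups."""
    if group in TAG_GROUPS:
        tag = TAG_GROUPS[group]
        return {vm for vm, tags in VM_TAGS.items() if tag in tags}
    if group in NESTED_GROUPS:
        resolved = set()
        for child in NESTED_GROUPS[group]:
            resolved |= members(child)
        return resolved
    return set()


def is_on_prem_management(ip: str) -> bool:
    """True if the address falls inside the management CIDR."""
    return ipaddress.ip_address(ip) in ON_PREM_MANAGEMENT


print(sorted(members("3-Tier App")))
VM_TAGS["WEB02"] = {"Web"}  # scale out: tagging a new VM is all that is needed
print(sorted(members("Web Servers")))
print(is_on_prem_management("192.0.2.15"))
```

Because membership is resolved from tags at lookup time, adding WEB02 changes both the Web Servers group and the nested 3-Tier App group without touching any group definition.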

To create an Inventory Group simply log into the Cloud Services Portal (CSP) and navigate to your SDDC. From here click on the Networking & Security, Groups and then Add Group:

I'm going to call this group Web Servers and give it a suitable description. Click on the Set Members link to set the membership criteria:

I now want to click on Add Criteria and specify Virtual Machine Tag Equals Web and click Apply:

Finally, I can click Save to commit the changes:

I now have my first Inventory Group, which will dynamically add VMs whenever a Web tag is applied to them. I will now continue and create the App Servers and DB Servers groups based on the App and DB tags.

I'm now going to create my 3-Tier App group with the membership being the Web Servers, App Servers and DB Servers groups:

To nest groups select Members and check the groups that you want to be nested:

Finally, the last group that I need to create is my On-Premises Management group:

This group is going to be based off the CIDR range so select the IP/MAC Addresses option and enter the range:

I now have all the required Inventory Groups created and can start building out my Distributed Firewall ruleset:

The Distributed Firewall has been completely overhauled as part of SDDC v1.9 with a ton of great new features. The interface has been updated and now includes rule categories which are evaluated based on priority precedence i.e. Emergency rules will be evaluated before Environment rules:

1 - Ethernet - Applied to all SDDC network traffic
2 - Emergency - Used for quarantine and allow rules
3 - Infrastructure - Define access to shared services. Global rules - AD, DNS, NTP, DHCP, Backup, Management Servers
4 - Environment - Rules between zones - production vs development, inter-business unit rules
5 - Application - Rules between applications, application tiers, or the rules between microservices
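
As a rough illustration of category precedence, the sketch below sorts a handful of made-up rules into the order the categories above imply. The rule names and the tuple layout are assumptions for the example; only the category order comes from the list.

```python
# Category order as listed above: a lower number is evaluated first.
CATEGORY_PRIORITY = {
    "Ethernet": 1,
    "Emergency": 2,
    "Infrastructure": 3,
    "Environment": 4,
    "Application": 5,
}

# Hypothetical rules as (category, position within category, name).
rules = [
    ("Application", 1, "Web to App"),
    ("Emergency", 1, "Quarantine infected VM"),
    ("Environment", 1, "Prod to Dev block"),
    ("Application", 2, "App to DB"),
    ("Infrastructure", 1, "Allow DNS and NTP"),
]


def evaluation_order(rule_list):
    """Sort by category priority first, then by position inside the category."""
    return sorted(rule_list, key=lambda r: (CATEGORY_PRIORITY[r[0]], r[1]))


ordered = evaluation_order(rules)
for category, position, name in ordered:
    print(f"{category:<15} {position} {name}")
```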

Configurations can now be saved and published at a later date/time (perhaps within a change window) and we can also fully exclude VMs from having DFW policies applied. These are just a few of the improvements; remember to check the release notes for more.

One of the huge usability improvements in SDDC v1.9 is the ability to dynamically re-prioritize rules by dragging and dropping them into place. You can also filter rules by name, source, destination or service and can edit rules inline on the UI. This makes creating and modifying DFW rules much more efficient.

We are going to create the required rules for our 3-Tier App and the first thing we need to do is create a new policy within the Applications section. The name of the policy can be modified inline:

Click on the three dots to the right of the rule and click Add Rule:

Give the rule a suitable name. In our example, we want this rule to allow Admin access to the 3-Tier App from the On-Premises Management Network for troubleshooting and support purposes:

Within the Source, set it to the On-Premises Management Inventory Group:

Set the Destination to be the 3-Tier App group:

Now the services we want to allow are SSH and ICMP (If the service you require is not available you can add custom services):

We can see that our admin access rule is in place but since we have not published the ruleset this is currently not live:

I'm now going to create the complete ruleset for the application based on the following requirements:

Once the ruleset has been created remember to publish it otherwise it will not be realised within NSX.

The final task is to tag our VMs with the correct tags, which will ensure they are added to the corresponding Inventory Groups and that the correct security policy is applied. Navigate to the Virtual Machines section and edit one of the VMs:

Tag the VM with the required tag. In our example, this would be either Web, App or DB:

Ensure all VMs in scope are tagged accordingly:

You can check to ensure the VMs have been tagged correctly and they appear in the security group by viewing the group members:

My Web Servers group now has WEB01 as a member:

Using groups and tags has the operational benefit of not having to make any changes to rules if I scale up the application. If I add a new web server, all I have to do is tag it with the Web tag (ideally through automation) and the correct security policies will be applied to the workload from day one.

Now that the Distributed Firewall policies have been created and the VMs tagged, let's check to ensure my application is working as expected. If I browse to the Web Server I can see that I can successfully connect to the 3-Tier Application:

From my desktop which is in the On-Premises Management CIDR range, I can successfully ping all workloads:

I can also successfully SSH into the Web Server but notice that I cannot SSH or ping from the Web Server to the Database Server:

VMware Cloud on AWS SDDC v1.9 is an update packed with new features and functionality and just shows the power of the platform in helping customers migrate and secure their workloads with familiar tooling.

Wednesday, 4 December 2019

Monitoring VMware Cloud on AWS vCenter alarms within vRealize Log Insight Cloud

vRealize Log Insight Cloud (vRLIC) gives us unified visibility across public and private clouds through robust log aggregation, analytics and faster root cause determination. The great news is that it is also included as part of your subscriptions to VMware Cloud on AWS (VMC) and as of VMworld Europe 2019 also now includes additional features and functionality:

Check out the official blog article here

With the core version, VMC customers now get access to real-time reporting, which is what we are going to look at today. In a previous post, I talked about creating VMC vCenter alarms and setting notifications for specific events. The event we were particularly interested in was the vSAN datastore reaching 70% utilisation, because at 75% a new host will be added to ensure we stay within SLA. In the example, we used an alert that would trigger if the datastore was less than 100% utilised, as this would ensure the alert would always trigger. We are now going to use vRLIC to query for the alert and then send us a notification once it has been triggered.

vRLIC is automatically configured to ingest logs for VMC and can be accessed via the Cloud Services Portal so there is nothing that you need to do to start using it, simply launch the application:

The initial landing page gives us a great overview of recent alerts and event observations over the last hour. It is definitely worth spending some time with vRLIC to see the level of information and default alerts that are available to VMC customers:

I've re-created the vCenter alert that triggers when VSAN Datastore Usage is below 100 percent just to ensure the log is sent instantly to vRLIC:

If we explore the logs and query for the alert name VSAN Datastore Usage is below 100 percent with a timeframe of the last ten minutes then we can see the triggered alert. We know this is the alert because we can see it change its state from Gray to Red:

Now that we have the query needed we can click on the save icon:

Give the query a suitable name and description and click Save:

Once we save the query we can click on the alert icon to create an alert based on the query:

Give the alert a suitable name and description and click Save:

The Alert Definition screen will appear, which will allow you to customise the alert. Remember to add the email address where you would like the alert to be sent, set the trigger to evaluate on every match and enable it before clicking on the save icon:

It's also worth sending a test alert to ensure you receive the notification:

Hopefully, if everything is set up correctly next time the alert is triggered you should receive a notification via email and also see it in the Recent Alerts:

With the example above we are triggering an email notification whenever a log is ingested that contains the text VSAN Datastore Usage is below 100 percent. This is not ideal because it will also trigger the notification when any changes to the alarm are made, e.g. resetting it to green or disabling and re-enabling it. I tried matching on the alarm name together with the text gray to red, which is sent when the state of the alarm changes, but during testing I noticed that this text was not always sent on certain alarm configuration changes. I have fed this back to the BU and it will be addressed in the future. I don't envisage these changes being made regularly in customer environments, so it should not cause an influx of emails.
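
The difference between the two matching approaches can be sketched in Python. The log messages below are made up for illustration and real vCenter events will be formatted differently; only the alarm name and the gray to red text come from this article.

```python
ALARM_NAME = "VSAN Datastore Usage is below 100 percent"

# Made-up log messages for illustration only.
sample_logs = [
    "Alarm 'VSAN Datastore Usage is below 100 percent' changed from Gray to Red",
    "Alarm 'VSAN Datastore Usage is below 100 percent' was reconfigured",
    "Alarm 'Host connection state' changed from Green to Red",
]


def matches_alarm(message: str) -> bool:
    """Broad match: any message that mentions the alarm name."""
    return ALARM_NAME.lower() in message.lower()


def matches_state_change(message: str, transition: str = "gray to red") -> bool:
    """Narrow match: the alarm name plus the state-transition text."""
    return matches_alarm(message) and transition in message.lower()


broad = [m for m in sample_logs if matches_alarm(m)]
narrow = [m for m in sample_logs if matches_state_change(m)]
print(len(broad), len(narrow))
```

The broad match also catches the reconfiguration message, which is the noise described above, while the narrow match only fires on the state change, provided the transition text is actually present in the log.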

A point that I would like to highlight is that vRLIC currently runs out of one of the US AWS regions so if there are issues with logs residing outside of the UK/EU then please get in touch and I will continue to raise this internally.

Monday, 2 December 2019

VMware Cloud on AWS vCenter Alarms

A lot of VMware customers use vCenter alarms and notifications for monitoring their on-premises environment and the same goes for when they move to VMware Cloud on AWS. I was recently asked by a customer how they can receive a notification when the vSAN storage capacity is getting close to 75% full. For those who are not aware, we need 25% slack space for vSAN and will automatically add a host once storage utilisation reaches 75%, which is documented in the Service Level Agreement for VMware Cloud on AWS.

Creating an alarm can either be completed directly in the vCenter client or via the Cloud Gateway Appliance. Simply browse to the WorkloadDatastore, select Configure and then Alarm Definitions. From here you need to Add a new alarm:

Give the alarm a suitable Name and Description and click Next:

In the example, I am setting the alarm to be triggered if the utilization is less than 100% (which will always be the case) to ensure that I receive a notification. Typically you would set this to "is above 70%" or whatever threshold you feel comfortable with. Once you have the correct parameters, enter the email address to which you would like the notification sent and click Next:
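
The threshold logic behind a production alarm is simple arithmetic, sketched below in Python. It assumes a 70% alert threshold and a host being added at 75% utilisation (the 25% slack space mentioned earlier); the function names and datastore sizes are hypothetical.

```python
ALERT_THRESHOLD_PCT = 70.0      # example alarm threshold
SCALE_OUT_THRESHOLD_PCT = 75.0  # utilisation at which a host is added


def utilisation_pct(used_tb: float, capacity_tb: float) -> float:
    """Return datastore utilisation as a percentage of capacity."""
    if capacity_tb <= 0:
        raise ValueError("capacity must be positive")
    return used_tb / capacity_tb * 100


def datastore_status(used_tb: float, capacity_tb: float) -> str:
    """Classify utilisation against the alert and scale-out thresholds."""
    pct = utilisation_pct(used_tb, capacity_tb)
    if pct >= SCALE_OUT_THRESHOLD_PCT:
        return "scale-out"
    if pct > ALERT_THRESHOLD_PCT:
        return "alert"
    return "ok"


# Hypothetical 40 TB datastore at three different fill levels.
for used in (30.0, 29.0, 20.0):
    print(used, datastore_status(used, 40.0))
```

Setting the alert a few points below the scale-out threshold gives you a window to free up space or plan for the extra host before it is added automatically.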

Set the email notification if you want to be notified once the condition clears otherwise click Next:

Review your settings and click Next when ready:

Since our alarm was set to trigger if storage utilisation was less than 100% we can see that it triggered straight away within vCenter:

We also received an email notification:

Since there is currently no sender address, there is a chance this might get picked up by your email spam filtering software, so you might have to create a rule to allow it through. Due to this annoyance, in the next article I will show you how to trigger an alert from vRealize Log Insight Cloud.