(Re)claim your space!

In one of my previous posts, Is Bitlooker from Veeam a game-changer?, I wrote about the benefits of using Bitlooker for backup jobs in Veeam Backup & Replication v9.x. However, Bitlooker is not only available for backup jobs; you can use it for replication jobs as well.

So I thought it’d be fun to see what difference, if any, it makes. The goal of my tests is to figure out the most effective way of copying/replicating a VM from one host to another.

The setup for the test:

A virtual machine is installed with Windows Server 2016 Standard edition and assigned a 100 GB thin-provisioned disk. The disk is then filled with files (a bunch of ISO files of different sizes). That’s the baseline. Then roughly 85 GB is deleted (all the added ISO files) and the Recycle Bin is emptied. This leaves a lot of blocks containing stale data: the operating system has marked them as available for reuse, but they haven’t been zeroed out, so from the hypervisor (outside the VM) they look like any other block containing data.

Operating system installed (Windows Server 2016) and updated. The VM now consumes 13.5 GB worth of storage.

Then a bunch of files were added (almost) filling the entire disk.

From the vSphere side of it:

At this point the newly added files were deleted and the Recycle Bin was emptied.

And from vSphere:

The ‘ls’ command will not show the space actually consumed by a thin-provisioned disk, so ‘du’ can be used instead to see the actual size of the vmdk file:
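To illustrate why the two commands disagree, here is a minimal Python sketch that uses a sparse file as a stand-in for a thin-provisioned disk. It assumes a POSIX system where `os.stat` exposes `st_blocks` in 512-byte units; the file name and the helper functions are made up for the example and have nothing to do with ESXi itself.

```python
import os
import tempfile

def apparent_and_allocated(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    apparent_bytes is the size `ls -l` reports; allocated_bytes is what
    `du` counts (st_blocks is expressed in 512-byte units on POSIX).
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512

def make_sparse_file(path, size_bytes):
    """Create a file with the given apparent size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)

with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo-flat.vmdk")  # hypothetical file name
    make_sparse_file(demo, 100 * 1024 * 1024)
    apparent, allocated = apparent_and_allocated(demo)
    print(f"apparent size (ls): {apparent} bytes")
    print(f"allocated size (du): {allocated} bytes")
```

On a file system that supports sparse files the allocated size will typically be far smaller than the apparent size, which is the same effect seen with the thin vmdk.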

I’m going to test 4 different scenarios:

  1. Migrate the VM from one host to another offline (the VM will be shut down).
  2. Replicate the virtual machine with VMware vSphere Replication 6.5.
  3. Replicate the virtual machine with Veeam Backup & Replication without Bitlooker enabled.
  4. Replicate the virtual machine with Veeam Backup & Replication with Bitlooker enabled.

The hypothesis, or point to prove, is that tests 1-3 will have little or no impact on the size of the vmdk file, whereas magic will happen in test 4. So let’s perform the tests and find out for real!

Test 1:

The VM was moved to another host while offline. Now let’s explore what can be seen using different methods.

Inside the VM:

From the host:

So no change in vmdk file size as expected.

Test 2:

The virtual machine will be replicated to another host using VMware vSphere Replication 6.5.

vSphere Replication has been configured using the following settings:

There are not a lot of settings and, in fact, the settings above have no impact on the vmdk size. They only control how the snapshot of the VM is generated (crash-consistent vs. consistent) and the impact the replication job has on the network.

Inside the VM:

From the host:

Since ‘ls’ doesn’t show the actual size on a thin disk, disk usage ‘du’ is used instead:

So no change in vmdk file size as expected.

Test 3:

The virtual machine is replicated to another host using Veeam Backup & Replication v9.5.

Replication has been set up on the Veeam side. To make a fair comparison with the vSphere Replication test (test 2), the Veeam job will not exclude swap file blocks:

Processed and read data in the picture below tells us that Veeam doesn’t know the difference between blocks in use and blocks marked as deleted (the same applies for almost all backup vendors):

Inside the VM:

From the host:

And using ‘du’:

So no change in vmdk file size as expected.

 

Test 4:

Now it’s time for the fun stuff. The virtual machine will be replicated to another host using Veeam Backup & Replication v9.5. We will use both space-saving techniques that can be enabled on the job. (With application-aware processing we can also exclude specific files, folders and file extensions, but we’re not using that feature in this test.)

Now, this is the magic we were looking for! The proxy server has processed all of the data, but it has only read the blocks that are actually in use!
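The difference between tests 1-3 and test 4 can be pictured with a toy model. The sketch below is only an illustration of the idea, not how Veeam implements Bitlooker: the `Block` class, the `in_use` flag and both functions are hypothetical names, and a real product would have to derive the "in use" information from the guest file system's metadata.

```python
from dataclasses import dataclass

@dataclass
class Block:
    data: bytes    # raw contents as seen from outside the VM
    in_use: bool   # True if the guest file system still references the block

def blocks_read_without_fs_awareness(disk):
    """Outside view: every non-zero block looks like real data."""
    return [i for i, b in enumerate(disk) if any(b.data)]

def blocks_read_with_fs_awareness(disk):
    """File-system-aware view: skip blocks the guest has marked as free."""
    return [i for i, b in enumerate(disk) if b.in_use and any(b.data)]

disk = [
    Block(b"\x01" * 4, True),    # operating system files
    Block(b"\x02" * 4, False),   # deleted ISO, contents never zeroed
    Block(b"\x03" * 4, False),   # deleted ISO, contents never zeroed
    Block(b"\x00" * 4, False),   # never written
]

print(blocks_read_without_fs_awareness(disk))  # [0, 1, 2]
print(blocks_read_with_fs_awareness(disk))     # [0]
```

Without knowledge of the file system, the two blocks holding deleted ISO data are indistinguishable from live data, which is why the vmdk stayed the same size in the first three tests.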

From the VM:

From the host:

The vSphere Web Client combines the .vmdk and -flat.vmdk files into one (like it has always done):

And the disk usage utility shows:

Yikes! That’s cool stuff!

Conclusion:

Bitlooker is a feature you should have enabled on any relevant job. It certainly can be used to reclaim that precious storage space you so desperately need. Heck, why not use it as part of your normal failover testing, because you’re already doing that, right? Once a month (or however often you feel is appropriate), do a planned failover using Veeam Backup & Replication, verify that your DR plan works and, as an added bonus, reclaim disk space in the process!

Yet another benefit is the time spent replicating the virtual machine: without Bitlooker it took 30 minutes to replicate the VM from one host to another, but just shy of 7 minutes with Bitlooker enabled.
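To put those two numbers in perspective, here is a quick back-of-the-envelope calculation. It rounds "just shy of 7 minutes" up to 7, so the real gain is slightly better than shown; the function name is made up for the example.

```python
def replication_speedup(baseline_minutes, improved_minutes):
    """Return (speedup factor, percentage of time saved)."""
    speedup = baseline_minutes / improved_minutes
    time_saved_pct = (1 - improved_minutes / baseline_minutes) * 100
    return speedup, time_saved_pct

speedup, saved = replication_speedup(30, 7)
print(f"{speedup:.1f}x faster, {saved:.0f}% less time")  # 4.3x faster, 77% less time
```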

So seriously, why are you not using this magic thing? There’s only one drawback: Bitlooker only supports the NTFS file system (i.e. Windows VMs).

The unsung heroes of DRS in vSphere 6.5

VMware vSphere 6.5 has been out for a while now. Some features didn’t get as much publicity as they should have, at least in my mind. A few of the really cool features introduced in vSphere 6.5 relate to DRS, opening up additional configuration and control of your cluster and virtual machines. They are found under the DRS settings in a section called ”Additional Options”, where you’ll see 3 new settings:

  • VM distribution

Under normal circumstances DRS will load balance VMs so that each virtual machine has the resources it requires, taking the cost of vMotioning workloads to new hosts into account as one decision-making parameter. (As a side note, CPU and RAM have been considered since the inception of DRS, but now it’s also network-aware.) As a result, you normally won’t see an exactly even distribution of virtual machines across the hosts in a cluster; some hosts may be running more VMs than others. That’s not a problem, since the VMs have the resources they need and are not starved, but users tend to expect DRS to split VMs equally across all hosts. One concern that might call for rethinking the default approach is availability: what if the host running the majority of the VMs crashes? HA will certainly restart the VMs, but a lot of services in the environment will be impacted. Enter VM distribution!

The purpose of VM distribution is to have a fair distribution of VMs on your hosts. Or as it says on one of the official VMware blogs: ”This will cause DRS to spread the count of the VMs evenly across the hosts.”

However, DRS will always prioritize load balancing over the VM spread, so even distribution of VMs is done on a best-effort basis.

 

  • Memory Metric for Load Balancing

DRS will mainly consider active memory when load balancing, as opposed to consumed or granted memory to VMs. DRS will also take some overhead memory into consideration, so it will use active memory + a 25 % overhead as its primary metric. Active memory is a ”mathematical model that was created in which the hypervisor statistically estimates what Active memory is for a virtual machine by use of a random sampling and some very smart math.  Due diligence was done at the time this was designed to prove the estimation model did represent real life.  So, while it is an estimate, we should not be concerned about its accuracy.”

A really good read on the subject, and where the quote above came from, is “Understanding vSphere Active Memory”.

It’s now possible to change which memory metric is used when load balancing: instead of active memory we can use consumed memory. And by the way, the definition of consumed memory: ”Amount of guest physical memory consumed by the virtual machine for guest memory. Consumed memory does not include overhead memory. It includes shared memory and memory that might be reserved, but not actually used.

Virtual machine consumed memory = memory granted – memory saved due to memory sharing” – Quote from “Memory Performance Counters – An Evolved Look at Memory Management”.

This option is equivalent to setting the existing cluster advanced option PercentIdleMBInMemDemand with a value of 100.
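One way to picture how the two metrics relate is the small sketch below. It rests on an assumption that goes beyond what is stated above: that the 25 % is applied to the idle part of consumed memory (consumed minus active), which is how the name PercentIdleMBInMemDemand reads. The function and parameter names are hypothetical and this is not VMware’s actual implementation.

```python
def drs_memory_demand_mb(active_mb, consumed_mb, percent_idle_in_demand=25):
    """Estimate the memory demand used for load balancing, in MB.

    Assumption: demand = active + a percentage of the idle consumed memory.
    """
    idle_consumed_mb = max(consumed_mb - active_mb, 0)
    return active_mb + idle_consumed_mb * percent_idle_in_demand / 100

# A VM with 2 GB active and 8 GB consumed memory
print(drs_memory_demand_mb(2048, 8192))       # 3584.0 with the default 25 %
print(drs_memory_demand_mb(2048, 8192, 100))  # 8192.0, i.e. consumed memory
```

Under that assumption, a value of 100 makes the demand equal to consumed memory, which matches the equivalence described above.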

 

  • CPU Over-Commitment

It’s now possible to enforce a specific level of vCPU to pCPU overcommitment in the cluster. If you only want to allow 4 virtual CPUs per physical core in your cluster, you can now have that configured and enforced by the cluster. When the cluster runs out of pCPU resources you won’t be able to power on any more virtual machines. You define a percentage of overcommitment for the entire cluster, setting it from 0 % (= no overcommitment allowed, or put another way: 1 vCPU = 1 pCPU) up to 500 %.
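The enforcement itself boils down to a simple admission check at power-on. The sketch below uses the 4 vCPUs per physical core example from above; it deliberately works with a plain ratio and does not model how vSphere maps the percentage setting to that ratio. The function and parameter names are made up for illustration.

```python
def can_power_on(physical_cores, vcpus_powered_on, vcpus_requested, max_vcpus_per_core):
    """Return True if powering on the VM keeps the cluster within its vCPU limit."""
    vcpu_limit = physical_cores * max_vcpus_per_core
    return vcpus_powered_on + vcpus_requested <= vcpu_limit

# A cluster with 32 physical cores and a limit of 4 vCPUs per core (128 vCPUs)
print(can_power_on(32, 120, 8, 4))  # True: 128 vCPUs is exactly at the limit
print(can_power_on(32, 126, 4, 4))  # False: 130 vCPUs would exceed the limit
```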