VMware Memory Management

Summary: Background and practical tips on VM memory management and performance.
Date: Around 2014
Refactor: 1 May 2025: Checked links and formatting.


This story is a follow-up to VMware CPU Management and Performance and shows you how to find performance issues related to memory.

When troubleshooting memory performance, the most important thing to find out is whether the host is having memory issues (you overcommitted memory and need to add more memory to the host) or whether the problem is restricted to the VM (you need to allocate more memory to the affected VM). If the host is affected, you will notice fairly quickly because of the predefined VMware alarms in vCenter. These alarms, however, only indicate the percentage of memory allocated, not whether that allocation is actually a problem. To dive into the problem, we need performance graphs.

Memory Graph for a Host

Perform the following steps to create a real-time graph for a host's memory usage:

Inside vCenter select a host
Select the Performance tab and change the view to “Advanced”
Click the “Chart Options” link.
Select Memory → Real-Time
Set the “Chart Type” to “Line Graph” and only select the host as an object.
Select these counters:
1. Used by VMkernel
2. Consumed
3. Active
4. Overhead
5. Swap Used
6. Usage
Click OK to see the chart.

Memory Graph Examples

Quiet ESX Host

This is an example of a Memory Graph for an ESX host that is quiet:

Busy ESX Host

This is an example of a Memory Graph for an ESX host that is busy (as in a high usage percentage):

This is an example of a Memory Graph for an ESX host that is busy (and that's ballooning, swapping and compressing memory):

Memory Graph for a VM

Perform the following steps to create a real-time graph for a virtual machine's memory usage:

Inside vCenter select a virtual machine
Select the Performance tab and change the view to “Advanced”
Click the “Chart Options” link.
Select Memory → Real-Time
Set the “Chart Type” to “Line Graph”
Select these counters:
1. Usage
2. Overhead
3. Consumed
4. Granted
5. Active
Click OK to see the chart.

Memory Graph Examples

Quiet VM

This is an example of a Memory Graph for a VM that is quiet:

Busy VM - active

This is an example of a Memory Graph for a VM that is busy (as in a high usage/active percentage):

Busy VM - active - no ballooning

This is another example of a Memory Graph for the same VM that shows that a high usage/active memory doesn't necessarily mean that the VM will start ballooning and swapping:

Busy VM - ballooning

This is an example of a Memory Graph for a VM that is busy and that is ballooning and swapping:

Memory Counters

Used by VMkernel

Amount of machine memory used by VMkernel for core functionality, such as device drivers and other internal uses

Consumed

Amount of memory consumed by a virtual machine, host or cluster

Active

Amount of memory that is actively used, as estimated by VMkernel based on recently touched memory pages

Overhead

Memory (KB) consumed by the virtualization infrastructure for running the VM
For each running virtual machine, physical memory is reserved for its virtualization overhead. The amount of overhead depends on the total configured memory size and the number of virtual CPUs assigned to the VM. For example:

2 vCPUs and 1 GB RAM: 176 MB overhead
8 vCPUs and 32 GB RAM: 1647 MB overhead

Swap Used

Amount of memory that is used by swap

See the Swapping section below for more information about swapping.

Usage

Memory usage as percentage of total configured or available memory

Shared

Amount of guest memory that is shared with other virtual machines, relative to a single virtual machine or to all powered-on virtual machines on a host
See the “TPS: Transparent Page Sharing” section below for more information about memory sharing.

Balloon

Amount of memory allocated by the virtual machine memory control driver (vmmemctl), which is installed with VMware Tools
See the “vmmemctl: Balloon-Driver Mechanism” section below for more information about ballooning.

Compressed

Amount of memory compressed by ESX
When memory is overcommitted and virtual pages need to be swapped, ESX first tries to compress the page. Pages that can be compressed to 2 KB or less are stored in the VM's compression cache. Accessing compressed memory is faster than accessing memory that is swapped to disk, so compression usually does not reduce performance significantly.
Memory compression is enabled by default, and its settings can be changed in the advanced options of the ESX host under Mem.MemZipxxx.
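The compress-or-swap decision described above can be sketched in a few lines of Python. This is only an illustration: zlib stands in for whatever compressor ESX actually uses, the function name place_page is made up, and the size limit is passed in as a parameter (the demo uses half a page as an assumed value).

```python
import os
import zlib

PAGE_SIZE = 4096  # bytes; one 4 KB memory page


def place_page(page: bytes, max_compressed_size: int) -> str:
    """Decide where a page that must be reclaimed ends up.

    Returns "compression-cache" if the compressed page fits within
    max_compressed_size bytes, otherwise "swap".
    """
    if len(page) != PAGE_SIZE:
        raise ValueError("expected exactly one page")
    # zlib is a stand-in compressor for this sketch, not what ESX uses.
    compressed = zlib.compress(page)
    if len(compressed) <= max_compressed_size:
        return "compression-cache"
    return "swap"


threshold = PAGE_SIZE // 2           # assumed limit for this demo
zero_page = bytes(PAGE_SIZE)         # all zeros: compresses extremely well
random_page = os.urandom(PAGE_SIZE)  # random data: effectively incompressible

print(place_page(zero_page, threshold))
print(place_page(random_page, threshold))
```

A zeroed page lands in the compression cache, while a page of random data does not shrink and falls through to swap.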

Granted

Amount of machine memory or physical memory that is mapped for a virtual machine or a host

Memory Techniques

TPS: Transparent Page Sharing

The VMkernel scans memory for duplicate pages and, when it detects them, arranges for them to be shared. It treats them as “copy-on-write”: they are read-only while being shared, but become private copies again after being written to.
TPS works by creating a hash table that holds a hash value for each page (4 KB) of memory. If identical hashes are found, a bit-for-bit comparison is performed to ensure the pages really are identical. These comparisons cost CPU time, so by default the duplicate detection is spread over a period of 60 minutes. This means that memory pages become shared over time, and that if you plan to suspend/resume or power off/power on VMs on a large scale you won't benefit from this function all that much.
However, when running multiple VMs with the same OS on the same ESX host, it will save you a lot of memory on OS pages. Also, Windows Server 2008 allocates its full configured memory on boot and writes zeros to it. This causes huge memory savings when combined with TPS.
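The hash-then-compare approach can be sketched as follows. This is a toy model, not VMware's implementation: SHA-1 is an arbitrary choice of hash for the example, share_pages is a made-up name, and copy-on-write is not modeled.

```python
import hashlib

PAGE_SIZE = 4096  # bytes; one 4 KB memory page


def share_pages(pages):
    """Return a list where entry i is the index of the page that backs page i.

    Pages with identical content end up backed by the same (first seen) copy.
    """
    by_hash = {}  # hash digest -> indices of distinct pages with that digest
    backing = []
    for i, page in enumerate(pages):
        digest = hashlib.sha1(page).digest()
        candidates = by_hash.setdefault(digest, [])
        # A matching hash is only a hint; confirm with a full byte comparison.
        match = next((j for j in candidates if pages[j] == page), None)
        if match is None:
            candidates.append(i)  # first copy of this content stays private
            backing.append(i)
        else:
            backing.append(match)  # duplicate: share the existing copy
    return backing


zero = bytes(PAGE_SIZE)
pages = [zero, b"\x01" * PAGE_SIZE, zero, zero]
backing = share_pages(pages)
print(backing)            # [0, 1, 0, 0]
print(len(set(backing)))  # 2 physical pages back 4 guest pages
```

The three zeroed pages collapse onto one physical copy, which is the same effect that makes freshly booted VMs with zeroed memory so cheap under TPS.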

vmmemctl: Balloon-Driver Mechanism

Let me explain this with an example. A host is constrained on memory and needs to get some of that memory back. The host has no idea which memory pages the guest/VM is actively using and which are free, so it also has no idea which memory pages can be reclaimed from the guest. VMware Tools includes a balloon driver, which runs inside the guest. This driver requests memory pages from the guest in order to “inflate” the balloon with memory pages the guest doesn’t need. Once the guest has allocated the memory to the balloon driver, the driver notifies the hypervisor so that the hypervisor can reclaim the host physical pages that back the balloon.
To use ballooning, the guest operating system must be configured with sufficient swap space.
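A toy model may help show why the guest needs swap space for ballooning to work. Everything below is hypothetical (the Guest class, its fields and the MB figures); it only mimics the idea that the guest hands over free memory first and pages out its own data when free memory runs short.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    """Toy guest OS; all sizes are in MB."""
    free_mb: int       # memory the guest is not using
    in_use_mb: int     # memory holding guest data
    swap_free_mb: int  # free space in the guest's own swap
    balloon_mb: int = 0

    def inflate_balloon(self, request_mb: int) -> int:
        """Give memory to the balloon driver and return how much was granted."""
        from_free = min(request_mb, self.free_mb)
        self.free_mb -= from_free
        # If free memory is not enough, the guest pages out its own data,
        # which is only possible while it has swap space left.
        from_swap = min(request_mb - from_free, self.in_use_mb, self.swap_free_mb)
        self.in_use_mb -= from_swap
        self.swap_free_mb -= from_swap
        granted = from_free + from_swap
        self.balloon_mb += granted
        return granted  # the host can reclaim this much backing memory


with_swap = Guest(free_mb=512, in_use_mb=1536, swap_free_mb=1024)
without_swap = Guest(free_mb=512, in_use_mb=1536, swap_free_mb=0)
print(with_swap.inflate_balloon(768))     # 768
print(without_swap.inflate_balloon(768))  # 512
```

In this model the guest without swap can only give up its free memory, so the host would have to find the remaining memory some other way.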

Swapping

ESX/ESXi hosts use swapping to forcibly reclaim memory from a virtual machine when the vmmemctl (balloon) driver is not available or is not responsive.
Swapping is, to put it simply, using a swap file to act as memory/RAM for the machine. Using a swap file as memory for a VM is much slower than using physical RAM, so it is usually something you don’t want to happen. The guest memory is “swapped” to the swap file instead of being kept in physical memory. The host will force the VM to swap when the balloon driver is not able to reclaim the memory.

Swapping will also occur if you set a limit on your memory resources in a VM that is less than the configured memory for the VM. If you use “Edit Settings” → “Resource tab” on a VM, you can set the memory limit, which tells the VM how much host physical memory it may use. If you set that to be less than the configured memory allocation, the VM will make up the rest of the memory in swap. So:

If memory limit < configured memory allocation
- then memory limit(physical) + swap = configured memory allocation.
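The arithmetic behind this rule is simple enough to put in a small helper. The function name and the MB figures are made up for the example.

```python
from typing import Optional


def swap_needed_mb(configured_mb: int, limit_mb: Optional[int]) -> int:
    """Memory (MB) that must be backed by swap once the VM uses all its memory.

    limit_mb=None means no memory limit is set on the VM.
    """
    if limit_mb is None or limit_mb >= configured_mb:
        return 0
    return configured_mb - limit_mb


print(swap_needed_mb(4096, 3072))  # 1024
print(swap_needed_mb(4096, None))  # 0
```

A VM configured with 4096 MB and limited to 3072 MB can have at most 3072 MB in host physical memory, so the remaining 1024 MB has to live in swap whenever the guest touches all of its memory.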

This can SERIOUSLY impact performance because, unlike guest swapping, kernel swapping gives the OS no way to decide what will be swapped and what should stay in memory. You can easily end up with critical memory areas being swapped out to disk (like services or even OS memory). This is what that looks like in the resource allocation tab:

UPDATE: I just read something about memory counters from VMware (see resources). If you want to see current swapping check the “swap in rate” and “swap out rate”. Swap in rate is the rate at which memory is swapped from disk into active memory during the interval. This counter applies to virtual machines and is generally more useful than the swapin counter to determine if the virtual machine is running slow due to swapping, especially when looking at real-time statistics.

NOTE TO SELF: So swap in is from swapped memory to active memory.

Ballooning Versus Swapping

With Reservations

By default, a maximum of 65% of a guest's virtual memory can be ballooned, provided the reservation is not set too high. But what if, for example, the reservation is set below 35%? The reserved memory will never be taken away by ballooning or swapping, but what happens to the part in between? That part will be swapped. Consider this example:

VM with reservation set to 30% of the configured memory.
- In times of memory congestion 65% of the memory can be ballooned and 5% will be swapped
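The same calculation for other reservation values, as a small sketch. The function name is made up, and the 65% default is simply the figure quoted above.

```python
def reclaim_split(reservation_pct: float, max_balloon_pct: float = 65.0):
    """Return (balloon_pct, swap_pct) of configured memory in the worst case."""
    if not 0 <= reservation_pct <= 100:
        raise ValueError("reservation_pct must be between 0 and 100")
    # Reserved memory is never reclaimed; everything else can be.
    reclaimable = 100.0 - float(reservation_pct)
    balloon = min(float(max_balloon_pct), reclaimable)
    # Whatever the balloon cannot cover is left to swapping.
    swap = reclaimable - balloon
    return balloon, swap


print(reclaim_split(30))  # (65.0, 5.0)
print(reclaim_split(50))  # (50.0, 0.0)
```

With a reservation of 35% or more, the balloon limit covers everything that is reclaimable and the swap share drops to zero.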

Reclaiming Memory

Memory is reclaimed by ballooning or swapping. The VMkernel tries to keep 6% of its memory free (Mem.minfreepct). When free memory is greater than or equal to 6%, the VMkernel is in the HIGH free memory state. In this state the ESX host considers itself not under memory pressure and will not reclaim memory beyond what the Transparent Page Sharing process, which is active by default, already does.
When available free memory drops below 6%, the VMkernel will use the memory reclamation techniques explained above. Which technique it uses depends on the configured settings. ESX uses four thresholds: high (6%), soft (4%), hard (2%) and low (1%). In the soft state (4% memory free) ESX prefers to use ballooning, but if free system memory keeps dropping and ESX reaches the hard state (2% memory free) it will start to use swapping.
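As a rough sketch, the preference described above can be written as a lookup on the percentage of free host memory. This is a simplification: the function name is made up, only the 6% and 2% figures from the text are used, and how the boundaries are treated (inclusive or exclusive) is an assumption.

```python
HIGH_PCT = 6.0  # at or above this, the host is not under memory pressure
HARD_PCT = 2.0  # at or below this, the host starts swapping


def preferred_reclamation(free_pct: float) -> str:
    """Return the reclamation technique the host would prefer."""
    if free_pct >= HIGH_PCT:
        return "none (TPS only)"
    if free_pct > HARD_PCT:
        return "ballooning"
    return "swapping"


for pct in (10.0, 4.0, 2.0, 1.0):
    print(pct, preferred_reclamation(pct))
```

The real VMkernel logic is more involved than a single lookup, so treat this as a reading aid for the thresholds rather than a description of the scheduler.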

Note that ballooned and swapped memory can be found in the performance tabs under exactly these names. Keep in mind that these counters show memory that was moved to these states at some point in the past; seeing them does not mean the VM is still ballooning or swapping. Also note that as long as swapped memory pages are not accessed again, they will not be swapped back in.

This is an example of a VM that is seriously swapping in and out because of a misconfiguration where the limit is set below the configured memory:
