Tag Archives: Sizing

XenServer PVS Accelerator Cache Sizing


How large should we make our PVS Accelerator cache? Too large and we waste resources. Too small and we lose the performance benefit.

Let’s take a step back and recall our best practice for sizing RAM on a Provisioning Services server.  We would typically say to allocate 2GB of RAM for each vDisk image the server provides.  This simple recommendation gives the PVS server enough RAM to cache portions of each image in the Windows system cache, which reduces local read IO. So for a PVS server delivering

  • 1 image:  we would allocate 2GB of RAM (plus 4GB more for the PVS server itself)
  • 2 images:  we would allocate 4GB of RAM (plus 4GB more for the PVS server itself)
  • 4 images:  we would allocate 8GB of RAM (plus 4GB more for the PVS server itself)

Easy.
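
If you prefer it in code form, here is a minimal sketch of that rule of thumb (Python, purely illustrative; the function name is mine and the 2GB-per-vDisk and 4GB base defaults simply restate the bullets above):

  def pvs_server_ram_gb(num_vdisks, gb_per_vdisk=2, base_gb=4):
      # Suggested RAM for a PVS server delivering num_vdisks vDisk images.
      return base_gb + num_vdisks * gb_per_vdisk

  for images in (1, 2, 4):
      print(images, "image(s):", pvs_server_ram_gb(images), "GB total")
  # 1 image(s): 6 GB total
  # 2 image(s): 8 GB total
  # 4 image(s): 12 GB total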

Let’s now focus on the XenServer portion of PVS Accelerator. If we use RAM as our PVS Accelerator cache, how many GB should we allocate?

I decided to test this out.  I first set my PVS Accelerator cache to use 2GB of RAM.

Once configured, I booted and logged into a single VM. I then started a second VM on the same XenServer host.

Let’s first look at the Windows Server 2012R2 results:

When the first VM starts, the cache is empty, so the cache gets populated while the VM boots and a user logs in. When the second VM starts, it doesn’t increase the size of the cache because everything it needs to boot and log on a user has already been cached.

The expectation is that Windows 10 should show a similar behavior.

The difference between Windows 10 and Windows 2012R2 with regard to PVS Accelerator is the amount of cache consumed during bootup and logon.  Windows 10 uses significantly more cache than Windows 2012R2.

Windows 2012R2 uses 40% of the cache while Windows 10 uses 80%.  So based on this, what should we allocate per XenServer host?  2GB? 1.5GB? 2.5GB?

First, remember that the boot process is weighted heavily on reads. The read/write ratio is close to the following:

  • Boot: 80% Reads : 20% Writes
  • Logon: 50% Reads : 50% Writes
  • Steady State: 10% Reads : 90% Writes

Second, these graphs only show boot and logon.  Users haven’t loaded any applications, which will initiate read operations, thus increasing the cache utilization.

Third, we have a limited supply of RAM.  Our goal isn’t to eliminate all reads; it is to significantly reduce them.

Based on that, I’d start with the following and MONITOR:

  • Windows 10: 2.5GB per image per host
  • Windows 2012R2: 2GB per image per host

Remember, the key is “per image per host”.

The PVS Accelerator cache is host specific. If my XenServer host is supporting VMs from 4 different images, I need my PVS Accelerator cache to be large enough to hold enough of the read IO to be beneficial across all 4 images.

If our environment is large enough, we will want to look at segmenting our XenServer hosts into groups of servers that each host only a few images. That way we can reduce the RAM allocated to PVS Accelerator.
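
To make the “per image per host” math concrete, here is a small sketch that totals the cache for one host (Python; the per-image values are the starting points suggested above, and the image mix is a made-up example):

  per_image_cache_gb = {"win10": 2.5, "win2012r2": 2.0}

  def accelerator_cache_gb(images_on_host):
      # Total PVS Accelerator RAM cache for a single XenServer host.
      return sum(per_image_cache_gb[os] for os in images_on_host)

  # Host running VMs from 4 different images (2 x Windows 10, 2 x 2012R2):
  print(accelerator_cache_gb(["win10", "win10", "win2012r2", "win2012r2"]))  # 9.0 GB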

Daniel (Follow on Twitter @djfeller)
Citrix XenApp and XenDesktop 7.6 VDI Handbook
XenApp Best Practices
XenApp Videos


Sizing XenApp Windows 2012R2 Virtual Machines


I guess I’m not done yet.

Last week, I posted the latest recommendations on sizing Windows 10 and Windows 7 virtual machines for a XenDesktop environment.  I received a few replies from people asking for any updates regarding Windows 2012R2.

Unfortunately, when we discuss Windows 2012R2 and XenApp, the recommendations are not as straightforward as Windows 10 and Windows 7.

  1. Because Windows 2012R2 does session virtualization (where many users share the same VM but each gets a separate session), sizing CPU and RAM is more difficult.
  2. Because we can publish multiple resources from the same VM, we can have a mix of light, medium and heavy users on the same VM at the same time.
  3. Because each VM will host multiple users, our VMs will be sized larger when compared to Windows 10 and Windows 7. To size correctly, we need to align our recommendations with the nuances of the hardware.

Let’s take a look at the latest recommendations before we go into more detail.

[Image: Windows 2012R2 XenApp VM sizing recommendations]

For vCPU, you’ll notice the recommendation is based on NUMA.  What is NUMA?  I recommend you read these two blogs by Nick Rintalan:

  1. An intro to NUMA
  2. A Discussion about Cluster on Die

To summarize, you get the best physical server density when the number of vCPUs assigned to your XenApp VMs matches either the number of cores within a NUMA node or 1/2 of a NUMA node.  If you go with 1/2 of a NUMA node, you will simply have twice as many VMs.

Cluster on Die is a little more complex, as newer chips don’t have equally sized NUMA nodes across all of their cores.  Cluster on Die is a BIOS option that balances the cores by splitting them into equally sized clusters.
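
As a rough illustration of the NUMA math (Python; the core count is an example, not a recommendation, and it assumes each socket is a single NUMA node):

  cores_per_numa_node = 12   # example value; check your own hardware
  sockets = 2

  # Size the VM to a full NUMA node or to half of one.
  vcpu_full_node = cores_per_numa_node        # 12 vCPU per VM
  vcpu_half_node = cores_per_numa_node // 2   # 6 vCPU per VM

  vms_full = sockets * (cores_per_numa_node // vcpu_full_node)   # 2 VMs
  vms_half = sockets * (cores_per_numa_node // vcpu_half_node)   # 4 VMs, twice as many
  print(vcpu_full_node, "vCPU x", vms_full, "VMs  or ", vcpu_half_node, "vCPU x", vms_half, "VMs")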

RAM

Sizing RAM is also a little different than it is for Windows 10 and Windows 7. With session virtualization, like XenApp, all users share the same OS instance. Users also share the same application instances. The OS and app instances only consume RAM once. That is a huge reduction in overall RAM usage, which is why the RAM recommendations are significantly lower than those for the desktop OS.

Of course, the amount of RAM you allocate is going to be based on the specifics of your applications.
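
A rough way to express that logic (Python; every number below is a placeholder you would replace with measurements from your own workload):

  def xenapp_vm_ram_gb(users, shared_os_and_apps_gb, per_user_gb):
      # The OS and application instances are paid for once; only the
      # per-user working set scales with the number of sessions.
      return shared_os_and_apps_gb + users * per_user_gb

  # Hypothetical example: 4GB shared footprint, 0.5GB per user session, 40 users.
  print(xenapp_vm_ram_gb(users=40, shared_os_and_apps_gb=4, per_user_gb=0.5))  # 24.0 GB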

PVS RAM Cache

Just like with Windows 10 and Windows 7 recommendations, the PVS RAM cache is extremely valuable in a Windows 2012R2 XenApp environment.  With PVS RAM Cache, we see huge reductions in IOPS for Windows 2012R2.

Daniel (Follow on Twitter @djfeller)
XenApp Advanced Concept Guide
XenApp Best Practices
XenApp Videos

Sizing Windows 10 and Windows 7 Virtual Machines


After reviewing all of the scalability tests we conducted over the past few months, I thought it was time to revisit the recommendations for sizing Windows 10 virtual machines.  I also reached out to Nick Rintalan to see if this is in line with what is currently being recommended for production environments (if you disagree, blame him 🙂 ).

[Image: Windows 10 and Windows 7 VM sizing recommendations]

A few things you will notice:

  1. Windows 7 and Windows 10 recommendations are similar.  Microsoft’s resource allocation for both operating systems is similar, and the Windows 10 and Windows 7 scalability tests resulted in similar numbers.
  2. Density – Experience: For some of the recommendations, there are 2 numbers. The first is if you are more concerned with server density and the second is if you are more concerned with the user experience.  What I find curious is if you have a heavy workload, are you as concerned with server density?
  3. PVS RAM Cache: Using the RAM cache will drastically reduce storage IOPS, which is critical to providing a good user experience.  The cache is taken from the total allocated RAM, and the RAM column already takes the RAM Cache numbers into account (a quick sketch follows this list).
  4. Hypervisor: There is no hypervisor identified.  Testing showed minor differences between XenServer, Hyper-V and vSphere.
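
On point 3, here is the quick sketch (Python; both numbers are placeholders, not recommendations):

  allocated_ram_gb = 4.0    # example RAM allocation for the VM
  pvs_ram_cache_gb = 0.5    # example PVS RAM cache size

  # The cache is carved out of the VM's own RAM, so the OS and applications
  # effectively work with what is left over.
  print(allocated_ram_gb - pvs_ram_cache_gb, "GB available to the OS and apps")  # 3.5 GB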

Daniel (Follow on Twitter @djfeller)
XenApp Advanced Concepts Guide
XenApp Best Practices
XenApp Videos

Sizing XenDesktop 7 App Edition VMs


In the Mobilizing Windows applications for 500 users design guide, we made the recommendation to allocate 8vCPUs for each virtual XenDesktop 7 App Edition host (formerly known as XenApp). Spreading this out across a server with two Intel Xeon E5-2690 @2.9GHz processors and 192 GB of RAM, we were yielding about 200 users per physical server and roughly 50 users per virtual server.
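
The arithmetic behind those numbers is straightforward (Python; the 8-cores-per-socket figure is the E5-2690’s core count, and the assumption that vCPUs are spread across the Hyper-Threaded logical processors is mine):

  cores_per_socket = 8            # Intel Xeon E5-2690
  sockets = 2
  logical_procs = cores_per_socket * sockets * 2   # with Hyper-Threading
  vcpu_per_vm = 8
  users_per_vm = 50               # roughly, from the testing

  vms_per_host = logical_procs // vcpu_per_vm      # 4 VMs
  print(vms_per_host * users_per_vm, "users per physical server")  # 200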

Of course, the design guide is the end result of a lot of testing by the Citrix Solutions Lab. During the tests, we had the Solutions Lab compare many (and I mean many) different configurations where they changed the number of vCPU, RAM size, and RAM allocation (dynamic/static) as well as a few other things. We ended up with the following:

A few interesting things:

  1. Dynamic vs static RAM in Hyper-V appeared to have little, if any, impact on overall scalability. The only time when the RAM allocation had a negative impact was when not enough RAM was allocated (no surprise there).
  2. The 8vCPU and the 4vCPU configurations resulted in very similar user concurrency levels. Get ready… The battle is about to begin as to whether we should use 8 or 4 vCPU. (Is anyone else besides me having flashbacks to 2009?)

A few years ago, we debated using 2vCPU or 4vCPU for XenApp 5 virtual machines. A few years later, the debate is resurfacing but this time, the numbers have doubled: 4 or 8. Here is what you should be thinking about… VMs are getting bigger because the hardware is getting faster, RAM is getting cheaper and the hypervisors are getting better (Nick Rintalan provided a pretty good overview of some of the reasoning for this during his discussion on NUMA cores in his XenApp Scalability v2013 Part 2 blog). The whole point is that none of this is new. It has been going on for a long time and will continue to do so.

The hypervisor helped us reduce our physical footprint by allowing us to better utilize our physical hardware. Now, with better technology, we are able to reduce our virtual footprint because the technology is able to support larger VM sizes.

Daniel – Lead Architect