Tag Archives: xenserver

Improving Logon Time with PVS Accelerator


The title is correct.  We can improve user logon time by implementing PVS Accelerator in XenServer 7.1.

This actually makes perfect sense.

We already showed that PVS Accelerator drastically improves VM boot times because portions of the master vDisk image are cached locally.  Booting a VM equates to roughly 80% reads and 20% writes.  VMs using the same image are reading the same blocks of data. Due to this similarity, we see huge reductions in network utilization when using the PVS Accelerator cache, and these reductions translate into faster boot times.
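As a rough back-of-the-envelope model (my numbers, not from any Citrix benchmark), you can estimate the remaining network traffic from the read/write split and an assumed cache hit rate:

```python
# Rough model of PVS boot traffic with a local read cache (illustrative only).
READ_FRACTION = 0.80   # share of boot IO that is reads (from the 80/20 observation)
WRITE_FRACTION = 0.20  # writes follow their normal write-cache path, unaffected here

for hit_rate in (0.0, 0.5, 0.9):  # assumed PVS Accelerator cache hit rates
    # Only reads that miss the cache still cross the network to the PVS server.
    remaining = READ_FRACTION * (1 - hit_rate) + WRITE_FRACTION
    print(f"hit rate {hit_rate:.0%}: ~{remaining:.0%} of the original PVS traffic remains")
```

At a 90% hit rate, only about 28% of the original traffic remains, which lines up with why cached boots are so much faster.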

But what about logon time?



Provisioning Services Accelerator


An interesting new feature was included with the XenServer 7.1 release: Provisioning Services Accelerator.

In a single sentence,

PVS Accelerator overcomes PVS server and network latency by utilizing local XenServer RAM/Disk resources to cache blocks of a PVS vDisk to fulfill local target VM requests.
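Conceptually, it behaves like a read-through block cache sitting in front of the PVS server. Here is a minimal sketch of that idea (my own illustration; `fetch_from_pvs_server` is a hypothetical stand-in for the network read, and the real cache lives in XenServer’s control domain):

```python
class ReadThroughBlockCache:
    """Toy model of PVS Accelerator-style read caching (illustrative, not the real code)."""

    def __init__(self, fetch_from_pvs_server):
        self._fetch = fetch_from_pvs_server  # slow path: a read over the PVS network
        self._cache = {}                     # fast path: blocks held in host RAM/disk

    def read_block(self, block_id):
        # The first request for a block goes over the network; every later request
        # from any target VM on this host is served from the local cache.
        if block_id not in self._cache:
            self._cache[block_id] = self._fetch(block_id)
        return self._cache[block_id]
```

Because every target VM boots from the same vDisk, the first VM warms the cache and the rest mostly hit it.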

Take a look at the demo video to see it in action.


Machine Creation Services RAM Cache and XenServer IntelliCache


As I was discussing the storage optimization capabilities in the Machine Creation Services vs Provisioning Services debate, I mentioned the use of a XenServer RAM-based read cache. This can be misunderstood as XenServer IntelliCache (a mistake I’m sad to say I’ve made in the past).

XenServer IntelliCache (released with XenServer 5.6 SP1) and XenServer RAM Cache (released with XenServer 6.5) are two different capabilities of XenServer, both of which try to reduce the IO impact on shared storage.

Let’s walk through different deployment scenarios with Machine Creation Services in XenApp and XenDesktop 7.9.

Scenario 1: Shared Storage on any Hypervisor

When defining a host connection, the default storage management option is to use shared storage.

This configuration results in the following architecture:

Default Storage

The virtual machines read (denoted by the black lines) from the master OS disk on shared storage. Writes (denoted with the red lines) from the virtual machine are captured in the differencing disk, located on the same shared storage as the master OS disk.

It is also important to note that the VM reads from its own differencing disk as well, not just the master OS disk.

Scenario 2: Shared Storage with Optimized Temp Data on any Hypervisor

With XenApp and XenDesktop 7.9, admins can select the “Optimize temporary data on available local storage” option when creating their host connection configuration.

Storage Config

Selecting this option results in the following changes to the architecture:

Opt Storage

For non-persistent desktops, instead of the temporary writes going into the shared storage differencing disk, the writes are now captured within the write cache disk on the local hypervisor storage.

Persist opt

But for persistent desktops, the optimize temporary data setting is not used, as all data is permanent. The writes are captured on shared storage within the differencing disk.

The value is that we don’t waste shared storage performance with data we don’t care about. We instead shift the storage IO to local, inexpensive disks.
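The placement decision can be summarized in a few lines (a sketch of my own; the function name and strings are illustrative, not a Citrix API):

```python
def write_destination(persistent: bool, optimize_temp_data: bool) -> str:
    """Where an MCS-provisioned VM's writes land (summary of Scenarios 1 and 2)."""
    if persistent or not optimize_temp_data:
        # Persistent desktops keep every write, so the temp-data optimization
        # does not apply and writes stay on the shared-storage differencing disk.
        return "differencing disk on shared storage"
    # Non-persistent desktops discard writes at reboot, so cheap local disk is fine.
    return "write cache disk on local hypervisor storage"
```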

Scenario 3: Shared Storage with Optimized Temp Data and RAM Caching on any Hypervisor

With XenApp and XenDesktop 7.9, a portion of the virtual machine’s RAM can be used to cache disk writes in order to reduce the impact on local/shared storage. During the creation of a new machine catalog, an admin defines the size of the RAM and disk cache.

Cache

The RAM cache operation adjusts the architecture as follows:

mcs cache

For non-persistent desktops with a RAM-based write cache, the writes first go into the non-paged pool portion of RAM within the VM. As the RAM cache is consumed, the oldest data is written to the temporary disk-based write cache.

However, this option is not applicable for persistent desktops due to the risk of data loss. If disk writes are cached in volatile RAM and the host fails, those writes will be lost, potentially resulting in lost data and corruption.

For non-persistent desktops, when used in combination with optimizing temporary data, we not only shift our write performance to low-cost local disks, but we also reduce the amount of write IO activity going to those disks.  This should further help reduce the costs by not requiring the use of local SSDs.
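The cache-then-overflow behavior can be modeled in a few lines (a toy sketch of my own; the real eviction logic lives in the VM’s storage driver and works on disk blocks, not Python objects):

```python
from collections import OrderedDict

class RamWriteCacheWithOverflow:
    """Toy model: newest writes stay in RAM, oldest spill to the disk write cache."""

    def __init__(self, ram_capacity_blocks):
        self.capacity = ram_capacity_blocks
        self.ram = OrderedDict()  # block_id -> data; insertion order tracks age
        self.disk = {}            # overflow target: the disk-based write cache

    def write(self, block_id, data):
        self.ram[block_id] = data
        self.ram.move_to_end(block_id)        # a rewritten block counts as newest
        while len(self.ram) > self.capacity:  # RAM consumed: spill the oldest block
            old_id, old_data = self.ram.popitem(last=False)
            self.disk[old_id] = old_data

    def read(self, block_id):
        # Recent writes are answered from RAM, spilled ones from the disk cache.
        return self.ram.get(block_id, self.disk.get(block_id))
```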

Scenario 4: Shared Storage with Optimized Temp Data and RAM Caching on Citrix XenServer

If the environment is deployed on Citrix XenServer, the architecture automatically includes a RAM-based read cache on each host.

Non-persistent on XenServer

Persistent desktop

For both non-persistent and persistent desktops, portions of the master OS image are cached within XenServer’s Dom0 RAM, so subsequent requests are retrieved from local RAM instead of generating read IOPS on shared storage.

This is valuable because we significantly reduce the master image reads from our shared storage. If you have 50 XenServer hosts, each running 100 Windows 10 virtual machines, every virtual machine will read the same data from the same master image. This adds significant amounts of read IO activity on shared storage. By caching the reads in local RAM on each XenServer host, we can drastically reduce our impact on shared storage.
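To put rough numbers on that example (the per-VM read volume is my assumption, purely for illustration):

```python
hosts = 50
vms_per_host = 100
boot_read_gb = 1.0  # assumed GB each VM reads from the master image at boot

without_cache = hosts * vms_per_host * boot_read_gb  # every VM reads shared storage
with_cache = hosts * boot_read_gb                    # roughly one warm-up read per host
print(f"shared storage reads: {without_cache:,.0f} GB -> ~{with_cache:,.0f} GB")
# shared storage reads: 5,000 GB -> ~50 GB
```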

We also have a RAM-based read cache in Provisioning Services.  This capability increased boot performance by 4X.  I would expect to see similar results with this XenServer feature.

Scenario 5: Shared Storage with Optimized Temp Data and RAM Caching on Citrix XenServer with XenServer IntelliCache

When the admin defines the host connection properties, Studio includes the IntelliCache option if the host connection is XenServer.

XS IC

For non-persistent and persistent desktops, a local, disk-based cache of the master OS image is captured on each XenServer host, reducing read IOPS from shared storage. As items are accessed, they are placed within XenServer’s RAM-based cache.

The write operations are different based on whether the desktop is non-persistent or persistent.

Non-persistent with IntelliCache

For non-persistent desktops, disk writes are first captured in the VM’s RAM cache. When the RAM cache is consumed, the oldest content is written to the local write cache.

Persistent with IntelliCache

For persistent desktops, disk writes are simultaneously captured in the local IntelliCache disk (.VHDCache file in /var/run/sr-mount) and in the shared storage differencing disk. When the VM reads data from disk, it first checks the local IntelliCache disk and then the shared storage differencing disk.

The value for this configuration is two-fold:

  1. Host-based IntelliCache Disk: Using IntelliCache with the Read Cache (RAM) provides us with two levels of caching on XenServer.  This could help reduce reads from shared storage in situations where our Read Cache (RAM) is not large enough.  Imagine if we have multiple images being delivered to each XenServer host. Our read cache (RAM) will not be large enough, resulting in increased read IO activity on shared storage. By combining the two, we should be able to keep shared storage read IO activity to a minimum.
  2. VM-Based IntelliCache Disk: For persistent desktops, even though each write is performed twice (local IntelliCache disk and differencing disk on shared storage), the reads will come from the local IntelliCache disk, helping to reduce the load on shared storage (see the sketch below). How much will this help the user experience and cost?  That is still to be determined.
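For the persistent-desktop behavior in point 2, a minimal sketch looks like this (my own model; the dictionaries are stand-ins for the .VHDCache file and the shared-storage differencing disk):

```python
class PersistentIntelliCacheDisk:
    """Toy model: writes mirrored to both disks, reads prefer the local copy."""

    def __init__(self):
        self.local_intellicache = {}  # stand-in for the .VHDCache file on the host
        self.shared_diff_disk = {}    # stand-in for the differencing disk on shared storage

    def write(self, block_id, data):
        # Each write lands twice: locally for fast reads later, and on shared
        # storage as the durable copy of the persistent desktop.
        self.local_intellicache[block_id] = data
        self.shared_diff_disk[block_id] = data

    def read(self, block_id):
        # Check the local IntelliCache disk first, then fall back to shared storage.
        if block_id in self.local_intellicache:
            return self.local_intellicache[block_id]
        return self.shared_diff_disk.get(block_id)
```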

Daniel (Follow on Twitter @djfeller)
XenApp Advanced Concept Guide
XenApp Best Practices
XenApp Videos

PVS vs. MCS – Part 3: Storage Optimization


This is part of a series comparing Provisioning Services and Machine Creation Services

 

For years, storage optimization has been one of the major strengths of Provisioning Services. With PVS, we can do the following:

  1. Optimized temporary storage allocation: PVS allows us to store the read-only master image on local or shared storage. We can also decide where to place the temporary write disk, which could be on the PVS server’s local storage, the hypervisor’s local storage, the hypervisor’s shared storage, within the virtual machine’s RAM, or a combination of RAM and hypervisor local storage.
  2. Read IOPS Optimization: By automatically utilizing the Windows 2012R2 system cache, we can drastically reduce read IOPS from the master image. This has been shown to significantly decrease VM boot time.
  3. Write IOPS Optimization: By utilizing a combination of RAM and local storage for the write cache, PVS can significantly reduce write IOPS going to the hypervisor’s storage. This helps reduce costs as well as improve the user experience.

The write IOPS optimizations are powerful for any deployment because of the impact they have on the user experience while helping to reduce the cost of VDI storage.

But where does this leave Machine Creation Services?

If you believe Machine Creation Services is severely lacking in these capabilities, the latest release might surprise you.

Storage Location

Historically, Machine Creation Services utilized a differencing disk to store the writes. One limitation with the differencing disk approach was that the disk had to reside on the same storage as the master image.

Default Storage

If you used shared storage to host your master images, you were also required to place all of your writes on the shared storage. This can drive up your costs.

With the 7.9 release of XenApp and XenDesktop, the differencing disk is transformed into a write-back cache disk. This transformation allows the writes to be separated from the master image storage location.

Opt Storage

When shared storage is used for the master image, the temporary storage (writes) can be stored on the hypervisor’s local storage. This is configured as part of the host connection configuration within Citrix Studio.

Storage Config

Read IOPS Optimization

Even though the writes are stored locally on the hypervisor, shared storage is still used for Read IOPS as each virtual machine on each hypervisor must read from the same set of images on shared storage.

Remember that PVS utilizes a RAM-based read cache to reduce Read IOPS from storage; when using XenServer, Machine Creation Services implements similar functionality.

RAM Read Cache

A portion of XenServer RAM is used to locally cache portions of the master OS disk.

Write IOPS Optimization

I believe the Write IOPS optimization is the biggest enhancement for Machine Creation Services because of the impact the similar write IOPS optimization technology had on Provisioning Services with respect to storage cost and user experience.

With Machine Creation Services, each virtual machine utilizes a portion of its non-paged pool memory for the Machine Creation Services RAM cache.

MCS RAM Cache

As the virtual machine begins writing data to disk, those operations are stored within the RAM cache. Eventually, the RAM cache will be consumed and the oldest cached data will be written to the write-back cache disk in 2 MB blocks.

This process is similar to how Provisioning Services handles the RAM-based write cache with disk overflow, which significantly reduced write IOPS.
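A quick piece of arithmetic shows why coalescing matters (my illustration, assuming 4 KB guest writes; the real reduction depends on how often blocks are overwritten while still in RAM):

```python
small_write_kb = 4          # assumed size of a typical random guest write
flush_block_kb = 2 * 1024   # the 2 MB blocks flushed from the RAM cache

writes_per_flush = flush_block_kb // small_write_kb
print(f"up to {writes_per_flush} small writes can leave RAM as a single disk write")
# up to 512 small writes can leave RAM as a single disk write
```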

XenApp and XenDesktop 7.9 enhances Machine Creation Services with:

  1. Optimized temporary storage allocation
  2. RAM-based Read IOPS optimization
  3. RAM-Based Write IOPS optimization

So in the comparison of PVS and MCS, where does that leave us now?

Again, things are fairly even.

Daniel (Follow on Twitter @djfeller)
XenApp Advanced Concept Guide
XenApp Best Practices
XenApp Videos

Sizing XenApp Windows 2012R2 Virtual Machines


I guess I’m not done yet.

Last week, I posted the latest recommendations on sizing Windows 10 and Windows 7 virtual machines for a XenDesktop environment.  I received a few replies from people asking for any updates regarding Windows 2012R2.

Unfortunately, when we discuss Windows 2012R2 and XenApp, the recommendations are not as straightforward as Windows 10 and Windows 7.

  1. Because Windows 2012R2 does session virtualization (where many users share the same VM but each gets a separate session), sizing CPU and RAM is more difficult.
  2. Because we can publish multiple resources from the same VM, we can have a mix of light, medium and heavy users on the same VM at the same time.
  3. Because each VM will host multiple users, our VMs will be sized larger when compared to Windows 10 and Windows 7. To size correctly, we need to align our recommendations with the nuances of the hardware.

Let’s take a look at the latest recommendations before we go into more detail.

Win12RwSizing

vCPU

For vCPU, you will notice the recommendation is based on NUMA.  What is NUMA?  I recommend you read these two blogs by Nick Rintalan:

  1. An intro to NUMA
  2. A Discussion about Cluster on Die

To summarize, you get the best physical server density when the number of vCPUs allocated to your XenApp VMs matches either the number of cores within a NUMA node or half of a NUMA node.  If you go with half of a NUMA node, you will simply have twice as many VMs.

Cluster on Die is a little more complex, as newer chips don’t always have equally sized NUMA nodes across cores.  Cluster on Die is a BIOS option that balances cores equally by creating clusters of cores.
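A worked example with assumed hardware (a dual-socket host where each socket is a single 12-core NUMA node) shows the two recommended VM shapes:

```python
numa_node_cores = 12  # assumed: one NUMA node per socket, 12 cores each
sockets = 2

for vcpus in (numa_node_cores, numa_node_cores // 2):  # full node or half a node
    vms_per_host = (sockets * numa_node_cores) // vcpus
    print(f"{vcpus} vCPUs per XenApp VM -> {vms_per_host} VMs per host")
# 12 vCPUs per XenApp VM -> 2 VMs per host
# 6 vCPUs per XenApp VM -> 4 VMs per host
```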

RAM

Sizing RAM is also a little different from Windows 10 and Windows 7. With session virtualization, like XenApp, all users share the same OS instance. Users also share the same application instances. The OS and app instances only consume RAM once. That is a huge reduction in overall RAM usage, which is why the RAM recommendations are significantly lower than for the desktop OS.
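To make the sharing effect concrete (every number below is invented purely for illustration):

```python
users = 60
vdi_ram_per_user_gb = 4.0  # assumed: each desktop VM carries its own OS and apps
shared_base_gb = 6.0       # assumed: OS and app instances loaded once per XenApp VM
per_session_gb = 0.3       # assumed: incremental RAM for each additional session

vdi_total = users * vdi_ram_per_user_gb
xenapp_total = shared_base_gb + users * per_session_gb
print(f"{users} users -> VDI: {vdi_total:.0f} GB vs XenApp host: {xenapp_total:.0f} GB")
# 60 users -> VDI: 240 GB vs XenApp host: 24 GB
```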

Of course, the amount of RAM you allocate is going to be based on the specifics of your applications.

PVS RAM Cache

Just like with Windows 10 and Windows 7 recommendations, the PVS RAM cache is extremely valuable in a Windows 2012R2 XenApp environment.  With PVS RAM Cache, we see huge reductions in IOPS for Windows 2012R2.

Daniel (Follow on Twitter @djfeller)
XenApp Advanced Concept Guide
XenApp Best Practices
XenApp Videos

Sizing Windows 10 and Windows 7 Virtual Machines


After reviewing all of the scalability tests we conducted over the past few months, I thought it was time to revisit the recommendations for sizing Windows 10 virtual machines.  I also reached out to Nick Rintalan to see if this is in line with what is currently being recommended for production environments (if you disagree, blame him 🙂 ).

Win10Sizing

A few things you will notice:

  1. Windows 7 and Windows 10 recommendations are similar.  Microsoft’s resource allocations for both operating systems are similar.  The Windows 7 and Windows 10 scalability tests resulted in similar numbers.
  2. Density – Experience: For some of the recommendations, there are 2 numbers. The first is if you are more concerned with server density and the second is if you are more concerned with the user experience.  What I find curious is if you have a heavy workload, are you as concerned with server density?
  3. PVS RAM Cache: Using the RAM cache will drastically reduce storage IOPS.  This will be critical to providing a good user experience and will be taken from the total allocated RAM.  The RAM column takes the RAM Cache numbers into account.
  4. Hypervisor: There is no hypervisor identified.  Testing showed minor differences between XenServer, Hyper-V and vSphere.

Daniel (Follow on Twitter @djfeller)
XenApp Advanced Concepts Guide
XenApp Best Practices
XenApp Videos

XenDesktop 7.7 and Windows 10


The other day, I was able to share the latest single server density results when running Windows 7 on XenDesktop 7.7. We looked at a range of parameters like:

      • PVS vs MCS
      • PVS Disk Cache vs RAM Cache
      • Citrix Policies: Very High Definition vs High Server Scalability vs Optimized for WAN
      • Windows 7 optimizations

Once that testing was complete, we moved on to the next version… Windows 10. And again, we looked at the exact same parameters.

First, we look at Microsoft Hyper-V 2012R2

HVwin10

Second, we look at Citrix XenServer 6.5 SP1

xswin10

What do you notice?

Between XenServer & Hyper-V… Not much

Between MCS and PVS… Not much, as the 5-6% gains with MCS would be offset by increased storage costs due to the lack of RAM caching capabilities

Between the different policies… Around a 7-8% improvement

Between OS optimizations… Around a 7% improvement

The last part I find very interesting.

If you recall, I recently posted a blog (Windows 10 Optimization: The Results) showing that the optimized OS config, based on the numerous Windows 10 optimizations, delivered a 20% improvement in density. As we look at this expanded series of tests, what I now see is something rather interesting. Simply by utilizing the Citrix policy templates, we achieve half of that density gain.

And I can tell you from experience that implementing the Citrix policies is much easier than working through all of those Windows 10 optimizations.

So my advice: definitely use the Citrix policy templates as your starting point. If you want to know more about them, I suggest the following:

Server Specs:

      • Processors: Dual Intel Xeon E5-2697 (12 cores each)
      • RAM: 384 GB
      • Workload: Knowledge Worker
      • VM Specs: 2vCPU, 4 GB RAM

 

Daniel (Follow @djfeller)

XenApp Best Practices

XenApp Videos