I recently had a discussion with a teammate about VMware datastores. We are using thin provisioning on an ESXi 4.1 installation backed by IBM XIV storage.
In our previous installation we ran ESX 3.x backed by DS4000 disk. What we found is that VMs grow like weeds, and our datastores quickly filled up. The admin just resized the datastores and we went on our way. A VMware technical rep later mentioned that, while it is supported, adding extents to VMFS datastores isn’t necessarily best practice.
When we laid down our new datastores, I wanted to avoid adding extents, so I made the LUNs 1 TB. That’s as big as I dared to go to avoid using extents in datastores, but it is probably too big for our little installation.
I noticed that our datastores were getting to about 90% utilized, so I added a new LUN and datastore. When I mentioned in our team meeting that I had added a datastore we had a somewhat heated discussion. My teammate really wanted to resize the LUN and add extents to the datastore. I pointed out that I didn’t think that was the best practice and 3 or 4 datastores isn’t really a lot to manage.
So, why not just use one datastore per storage array? The big argument seems to be that people add a second LUN, then extend the datastore onto the new LUN. The downside of this is that if one LUN goes offline, all the data on it is unavailable. VMware will try to keep all the data for each VM on one extent, but it’s not always successful. So, if one LUN goes offline, the best case is that only some of your VMs are affected. Less ideally, VMs whose data spans extents lose part of their data, so more VMs are impacted or are left running with some of their storage unavailable. Or, if the failed LUN is the first LUN (the Master Extent), the whole datastore goes offline. At least the architecture allows a datastore to survive losing an extent under ideal circumstances.
What’s less apparent is the performance hit of multiple VMs and ESX hosts accessing one large LUN. With a lot of VMs generating I/O, you can exceed the LUN’s disk queue depth, which defaults to 32 operations per LUN. Adding more LUNs to the datastore DOES increase the number of queue slots for the whole datastore. And that would be a good thing if the data were equally distributed across all the LUNs, which is not going to be the case.
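To put rough numbers on that, here is a minimal sketch. The function name and the I/O counts are made up for illustration; the only figure taken from above is the default of 32 slots per LUN. It compares the same 60 outstanding I/Os spread evenly across two extents versus piled mostly onto the first one.

```python
QUEUE_DEPTH = 32  # default per-LUN queue depth mentioned above


def queue_pressure(outstanding_per_lun, queue_depth=QUEUE_DEPTH):
    """Return (total_slots, overflow_per_lun) for a datastore's extents.

    outstanding_per_lun: outstanding I/Os aimed at each LUN (illustrative numbers).
    overflow_per_lun: I/Os that have to wait because that LUN's queue is full.
    """
    total_slots = queue_depth * len(outstanding_per_lun)
    overflow = [max(0, n - queue_depth) for n in outstanding_per_lun]
    return total_slots, overflow


even = queue_pressure([30, 30])    # 60 I/Os split evenly over two extents
skewed = queue_pressure([55, 5])   # same 60 I/Os, mostly on the first extent

print(even)    # (64, [0, 0])
print(skewed)  # (64, [23, 0])
```

Both layouts have 64 slots on paper, but in the skewed case 23 I/Os are stuck waiting on the first LUN while the second sits mostly idle.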
And, similar to inode locking in a filesystem, shared storage has to contend with volume locking. Multiple hosts can read from the same LUN with no problem. But when a host makes a change that updates VMFS metadata, such as growing a thin-provisioned disk, powering on a VM, or taking a snapshot, it locks the whole volume with a SCSI reservation until the update is committed. Any other host that tries to access the LUN during that window gets a reservation conflict and has to retry. On modern disk arrays, with write caching, this should be very fast; but it’s not ideal.
So, to avoid lock contention between hosts you can try to keep all the VMs on a given datastore running on one host. But that’s not really practical long-term as VMs get migrated between hosts. Or, you can minimize the number of VMs that are using each datastore. In addition to keeping the number of VMs per datastore low, a strategy to consider is to mix heavy I/O VMs with VMs that have low I/O requirements, which will help manage the queue depth for each LUN.
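One way to picture the mixing strategy is a simple greedy placement: take the heaviest VM first and always drop it on the datastore with the least I/O so far. This is only a sketch of the idea, not anything VMware does for you; the VM names, datastore names, and IOPS figures are hypothetical.

```python
import heapq


def spread_vms(vm_iops, datastore_names):
    """Greedy placement: heaviest VM first, onto the least-loaded datastore.

    vm_iops: dict of VM name -> estimated I/O load (made-up units).
    Returns a dict of datastore name -> list of VM names.
    """
    heap = [(0, name) for name in datastore_names]  # (current load, datastore)
    heapq.heapify(heap)
    placement = {name: [] for name in datastore_names}
    heaviest_first = sorted(vm_iops.items(), key=lambda kv: kv[1], reverse=True)
    for vm, iops in heaviest_first:
        load, name = heapq.heappop(heap)       # least-loaded datastore so far
        placement[name].append(vm)
        heapq.heappush(heap, (load + iops, name))
    return placement


vms = {"oracle1": 900, "mssql1": 800, "web1": 50, "web2": 40, "web3": 30, "web4": 20}
print(spread_vms(vms, ["ds1", "ds2"]))
# {'ds1': ['oracle1', 'web4'], 'ds2': ['mssql1', 'web1', 'web2', 'web3']}
```

The two database servers end up on different datastores, each padded out with quiet web servers. Note that this balances I/O load, not VM count, so you would still want to cap the number of VMs per datastore separately.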
How many VMs is too many per datastore? It depends on your environment. I’ve seen recommendations ranging from 12 to 30. If you have a lot of static web servers that don’t do any real writes, you can get away with lots. If you have Oracle or MS SQL servers that do a lot of I/O, including writes, keep the numbers low. You can log into the ESX host, run esxtop, and hit “u” for the disk device view. There are lots of interesting fields in here: CMDS/s, READS/s, WRITES/s, and so on. Check the QUED field to see how many commands are currently queued, waiting for a free queue slot.
A good rundown on this is in Mastering VMware vSphere 4. Recommendations from the book: use single-extent VMFS datastores, one per LUN; don’t add extents just because you don’t want to manage another datastore; but go ahead and span a VMFS datastore if you need really high I/O or really big datastores.
I have another take on it. Always use one LUN per datastore. The argument that datastores backed by multiple LUNs give better performance is a little flawed, because VMware tries to allocate all the storage associated with one VM on one extent. If you need high I/O, give the VM a virtual disk from each of several datastores, then separate the data logically inside the VM. You get to leverage more disk queue slots by bringing in more LUNs per VM, each datastore is a single LUN that is easy to manage and maintain, and LUN locking is less of an issue with smaller datastores. And, while you do end up with more datastores, it’s not that big of a deal to manage.
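Here is the back-of-the-envelope version of that argument. It is a simplified model with hypothetical datastore and LUN names: it assumes the allocator succeeds in keeping a VM’s data on a single extent (modeled as the first LUN of the datastore) and that each LUN has the default 32 queue slots.

```python
QUEUE_DEPTH = 32  # default per-LUN queue depth


def vm_queue_slots(vm_disk_datastores, datastore_luns, queue_depth=QUEUE_DEPTH):
    """Queue slots one VM can reach, given where its virtual disks live.

    vm_disk_datastores: datastore name for each of the VM's virtual disks.
    datastore_luns: datastore name -> list of backing LUNs (extents).
    Simplifying assumption: within a datastore, all of the VM's data
    lands on one extent, modeled here as the first LUN in the list.
    """
    luns_in_use = {datastore_luns[ds][0] for ds in vm_disk_datastores}
    return queue_depth * len(luns_in_use)


# One spanned datastore with three extents; all three VM disks live on it.
spanned = {"big": ["lun0", "lun1", "lun2"]}
print(vm_queue_slots(["big", "big", "big"], spanned))   # 32

# Three single-LUN datastores; one VM disk on each.
single = {"ds1": ["lun1"], "ds2": ["lun2"], "ds3": ["lun3"]}
print(vm_queue_slots(["ds1", "ds2", "ds3"], single))    # 96
```

Same three LUNs either way, but under these assumptions the VM only gets the benefit of all three queues when its disks are deliberately placed on separate datastores.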
The down-side, and there usually is one, is that you’re back to relying on more pieces that could go wrong. If you spread the data across multiple datastores, and a datastore goes offline, that VM is impacted. It’s really about the same exposure you have with using multiple LUNs per datastore. If the LUN your data is on goes down, your data is unavailable. So plan your DR and availability schemes accordingly.