If you manage your own SAN, you’ll eventually be asked questions like “Why are some of my databases slow?”, “Why do we periodically have performance problems?” or “Do we have a hot LUN?”. Modern arrays have real-time performance monitoring, but not all of them keep historical data, so you can’t always tell whether there’s a periodic performance issue or whether the current performance is out of the ordinary. There are vendor-supplied products and lots of third-party products that let you gather performance statistics, but they’re usually pretty expensive. If you just need to gather and report on the performance data for IBM V7000, SVC, or DS8000 storage, there is a great FREE product called Stor2RRD.
Stor2RRD is developed by XORUX, the developers of the excellent Lpar2RRD tool, and is free to use with relatively modest fees for support. As its name suggests, it collects data from your storage arrays and puts the data into RRD databases. It has much the same requirements as Lpar2RRD, a simple Linux web server with Perl and RRD, and you can run it on the same server as Lpar2RRD. If you have a DS8000 array, you’ll also need the DSCLI package for your storage; for an SVC or V7000 storage array, you just need SSH.
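If you haven’t met RRD before, the key idea is a fixed-size, round-robin store: once it is full, each new sample overwrites the oldest one, so the database never grows. The Python sketch below only illustrates that idea with a bounded deque. It is not how Stor2RRD or RRDtool is implemented, and the class name and sample values are made up for the example.

```python
from collections import deque


class RoundRobinSeries:
    """Toy fixed-size series: illustrates the round-robin idea only."""

    def __init__(self, slots):
        # A deque with maxlen drops the oldest entry when a new one is appended.
        self.samples = deque(maxlen=slots)

    def add(self, timestamp, value):
        self.samples.append((timestamp, value))

    def average(self):
        if not self.samples:
            return None
        return sum(v for _, v in self.samples) / len(self.samples)

    def maximum(self):
        if not self.samples:
            return None
        return max(v for _, v in self.samples)


# Hypothetical read-latency samples in ms, one every 300 seconds.
series = RoundRobinSeries(slots=3)
for ts, latency_ms in [(0, 1.0), (300, 2.0), (600, 3.0), (900, 4.0)]:
    series.add(ts, latency_ms)

# Only the 3 newest samples remain (2.0, 3.0, 4.0); the first was overwritten.
print(len(series.samples), series.average(), series.maximum())
```

The trade-off is that storage stays constant no matter how long you collect, at the price of losing the oldest detail.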
We had issues getting version 0.45 to work, but the developers responded to a quick email with a preview of the next version, 0.48, which fixed the problem. The setup was pretty simple: we didn’t have any problems with the provided directions, and got everything set up and tested in a couple of hours.
After running the tool for a couple of weeks, we’ve collected what seems like a lot of data. Some of the high-level graphs are very busy, so much so that they run the risk of being “data porn”, data for data’s sake that loses some of its usefulness. But you can drill down from these high-level graphs to the Storage Pool, MDisk, LUN, drive or SAN Port level and get details like IOPS, throughput, latency and capacity.
For instance, here is a graph of the read performance for the managed disks in one of our V7000s:
That sure looks like mdiskSSD3, the teal blue one, is a hot array. Here is the read response time for that particular mdisk:
The response time isn’t too bad on that array, 3ms Max and 1.4ms on average, which for this data is more than fast enough.
This is just one simple example of the data that Stor2RRD collects. With this data we have real information showing whether a system’s slowness is because the server is using an abnormal amount of bandwidth or whether we should consider adding more SSD to an over-subscribed pool. That helps us make intelligent storage decisions and back up our reasoning with real numbers.
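As a rough illustration of what “abnormal” can mean once you have history to compare against, the Python sketch below checks a current reading against the mean and standard deviation of past samples. The function name, the threshold of three standard deviations, and the numbers are all assumptions made for the example. Stor2RRD itself presents graphs, and nothing here reflects its internals.

```python
from statistics import mean, stdev


def is_out_of_ordinary(history, current, threshold=3.0):
    """Return True if `current` is more than `threshold` standard
    deviations above the mean of `history` (a hypothetical rule of thumb)."""
    if len(history) < 2:
        return False  # not enough history to judge
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: anything above the baseline stands out.
        return current > baseline
    return (current - baseline) / spread > threshold


# Hypothetical throughput samples in MB/s for one server.
history = [100, 110, 90, 105, 95]
print(is_out_of_ordinary(history, 104))  # within the usual range
print(is_out_of_ordinary(history, 400))  # far above the baseline
```

A fixed multiple of the standard deviation is a crude test, but it shows why historical data matters: without a baseline there is nothing to compare the current number to.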
For the cost of a small Linux VM, you can deploy a troubleshooting and monitoring tool that rivals some very expensive third-party products. And, if it’s helpful in your environment, Stor2RRD annual support is a fraction of the cost of other products.
There is a full-featured demo on the Stor2RRD website where you can use the tool yourself with the developers’ data.