Stor2RRD Overview

If you manage your own SAN, you’ll eventually be asked questions like “Why are some of my databases slow?”, “Why do we periodically have performance problems?” or “Do we have a hot LUN?”. Modern arrays have real-time performance monitoring, but not all of them keep historical data, so you can’t always see whether there’s a periodic performance issue or whether the current performance is out of the ordinary. There are vendor-supplied products and lots of third-party products that let you gather performance statistics, but they’re usually pretty expensive. If you just need to gather and report on the performance data for IBM V7000, SVC, or DS8000 storage, there is a great FREE product called Stor2RRD.

Installing the XIVGui on Fedora 16

I’ve been running the XIVGui on a Windows 7 VM so that I have it available from anywhere. That works, but then I have to launch an rdesktop session, log in, launch the XIVGui, and log in again. I finally got tired of the extra steps and decided to load the XIVGui when I upgraded to Fedora 16. I considered making an RPM, but I’m sure IBM would frown on redistributing their code. These manual steps work great on Fedora 16 and should work fine on Fedora 15; I haven’t tested them with RHEL or other versions.

NPIV N-Port changes w/ AIX

I was at a meeting with other storage admins where they talked about never using NPIV with AIX servers because AIX can’t handle it if the N-Port ID changes due to an N-Port failover in AG mode. I’ve never seen that. In testing, our AIX boxes handled the failover without any problems. Part of the reason may be that I’ve enabled Dynamic Tracking and Fast I/O Failover on these fibre adapters. Dynamic Tracking allows for N-Port ID changes, and Fast I/O Failover makes the failure of a fibre adapter happen faster, which can be good if you are using a multi-path driver. It’s a simple change, but it requires either a reboot or bringing down the adapter for the changes to take effect. Here’s the command to make the changes in the ODM, which will be applied the next time you reboot:
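Assuming the FC protocol device is fscsi0 (substitute each of yours), the usual chdev invocation looks like this; -P stages the change in the ODM only, so it takes effect at the next reboot or adapter reconfigure:

```shell
# Enable Dynamic Tracking and Fast I/O Failover on the FC protocol
# device. Repeat for each fscsiN on the box. -P updates only the ODM,
# so the running adapter is untouched until a reboot/reconfigure.
chdev -l fscsi0 -a dyntrk=yes -a fc_err_recov=fast_fail -P
```

You can confirm the staged values afterward with `lsattr -El fscsi0 -a dyntrk -a fc_err_recov`.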

AIX – Remove failed MPIO paths

Here is a quick and dirty script to remove failed MPIO paths. You can end up with failed paths if you make some SAN connection changes.

for disk in `lsdev -Cc disk | grep 2107 | awk '{ print $1 }'`
do
        for path in `lspath -l $disk -F "status connection" | grep Failed | awk '{ print $2 }'`
        do
                echo $disk
                rmpath -l $disk -w $path -d
        done
done

Load balance algorithm w/ AIX and XIV

IBM only supports a queue depth of 1 when attaching to XIV with the default algorithm of round_robin. Usually round_robin or load_balance is the best choice, but since IBM is only supporting a queue depth of 1 at this time, there is a real performance penalty for asynchronous I/O. This looks to have been fixed in 5.3.10 (APAR IZ42730) and 6.1 (APAR IZ43146), but it is still broken (and probably never to be fixed) in earlier releases.
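For reference, you can check and stage these attributes per hdisk with lsattr and chdev (hdisk2 and the queue_depth value below are just examples; verify the currently supported settings for your AIX level before changing anything):

```shell
# Show the current multipathing algorithm and queue depth for one disk
lsattr -El hdisk2 -a algorithm -a queue_depth

# Stage a queue depth change in the ODM (-P defers it until the disk
# is reconfigured or the host is rebooted)
chdev -l hdisk2 -a queue_depth=1 -P
```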

IBM XIV Thin Provisioning

Thin provisioning on an IBM XIV is pretty hot, but there are some gotchas. Thin provisioning lets you allocate more space in LUNs to your hosts than you have in physical storage. So, if you have a lot of filesystems or volume groups with a lot of free space, that’s cool: where on other storage systems you would burn the whole space allocated to the LUNs up front, here you only consume physical space for data that has actually been written. It’s easy to burn yourself, though, so you have to monitor the free space in the XIV “Storage Pools”. When a Storage Pool fills up, its volumes go to Read-Only mode until you resize the pool.
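Monitoring can be as simple as a periodic XCLI query from a management host. A sketch, assuming the XIV XCLI is installed; the user, password, and hostname below are placeholders, and you should check your XCLI version’s syntax:

```shell
# List the storage pools with their hard (physical) and soft
# (provisioned) sizes -- watch hard space used against hard size
xcli -u admin -p password -m xiv1.example.com pool_list
```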

Determining how many BB credits you need

At 4Gb, a full packet is about 1km long on the wire, and at 2Gb a full packet is about 2km long!  Yes, at any given time roughly 2KB of your data is spread from the port to 1km down the cable (as the light travels).  Each packet burns one buffer credit, no matter the size of the packet.  The credit isn’t released until the packet gets to the receiving switch and the sending switch receives an acknowledgment.  So, at 4Gb, you need 2 buffer credits for each km of distance: 1 to fill the 1km pipe to the receiving switch, and 1 waiting for the acknowledgment.
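The rule of thumb above can be sketched as a quick calculation: a full frame occupies about 2km of fiber at 2Gb and about 1km at 4Gb, and each credit is held for the round trip, so credits come out to roughly distance × speed ÷ 2. This little helper script is mine, not a vendor tool, and assumes full-size (~2KB) frames:

```shell
# Rough BB-credit estimate for full-size FC frames over distance.
# credits ~= one-way distance (km) * link speed (Gb) / 2, rounded up:
# 2 credits/km at 4Gb, 1 credit/km at 2Gb.
speed_gb=${1:-4}    # link speed in Gb
dist_km=${2:-10}    # one-way distance in km
credits=`awk -v s="$speed_gb" -v d="$dist_km" \
    'BEGIN { c = d * s / 2; if (c != int(c)) c = int(c) + 1; print c }'`
echo "${dist_km}km at ${speed_gb}Gb needs roughly ${credits} BB credits"
```

Real-world sizing should also pad for smaller-than-full frames and any vendor guidance for long-distance links.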

Enabling Access Gateway (NPIV) on Brocade

Brocade’s flavor of NPIV is called Access Gateway. It’s a way to dumb down the switch and make it more of a pass-through device. When AG is enabled, the switch makes far fewer routing and switching decisions and passes all the traffic to an upstream switch. The upstream switch’s ports become F ports, and the uplink (“egress”) ports on the AG switch become N ports.
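A sketch of the Fabric OS CLI sequence for flipping a switch into AG mode (assuming a FOS release with the ag command; check the release notes for your version, and note that enabling AG mode wipes the switch’s zoning config and reboots it):

```shell
# On the switch being converted to Access Gateway -- this is
# disruptive, so plan for an outage:
switchdisable
ag --modeenable

# After the switch comes back up, confirm AG mode is active:
ag --modeshow
```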