Where do my I/Os go?!?

You may have noticed that in the “vmstat -s” output there are several counters related to I/O.

...
            344270755 start I/Os
             16755196 iodones
...

Here’s what the man page has to say about these counters:

start I/Os
                   Incremented for each read or write I/O request initiated by
                   VMM. This count should equal the sum of page-ins and page-outs.

iodones
                   Incremented at the completion of each VMM I/O request.

There’s something fishy going on here. Going by the counters above, the system started roughly 20 times more I/Os than it completed. So, what gives? I once had a discussion about this in which the admin explained that I/Os get started, but not completed. The thinking goes that the I/Os get put into a queue, and if an I/O isn’t serviced quickly enough it times out and is retried. Sort of like a TCP retransmit for disk I/Os.

But, if that happened, you would think that there would be an error in the errpt, or errors logged in the device subsystem. At least you would see unreasonable I/O service times. And I never see anything like that.

What seems to be going on is that the I/Os get dispatched to the VMM, and if they are sequential they are coalesced into 1 I/O and serviced by 1 system call. When the system detects sequential I/O, the next I/O reads in 2 * j2_minPageReadAhead file pages, the one after that reads in 4 * j2_minPageReadAhead file pages, and so on, doubling each time until it reaches j2_maxPageReadAhead. Each set of I/Os is serviced by 1 system call, even though it pulls in multiple pages. And that is way better for performance.
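To make the doubling concrete, here is a minimal Python sketch of the progression as described above. It is only a model of the behavior, not how JFS2 is actually implemented, and the function name and the example values of 2 and 128 pages are assumptions chosen for illustration.

```python
def read_ahead_sizes(min_pages, max_pages):
    """Pages requested by each successive read-ahead step.

    Models the progression described in the text: 2 * min, 4 * min, ...
    doubling until the request size is capped at max_pages.
    """
    if min_pages <= 0 or max_pages < min_pages:
        raise ValueError("need 0 < min_pages <= max_pages")
    sizes = []
    size = 2 * min_pages
    while size < max_pages:
        sizes.append(size)
        size *= 2
    sizes.append(max_pages)  # the final step is capped at the maximum
    return sizes


# Illustrative values only: j2_minPageReadAhead=2, j2_maxPageReadAhead=128
steps = read_ahead_sizes(2, 128)
print(steps)                                   # pages per request
print(sum(steps), "pages in", len(steps), "requests")
```

In this model, once the ramp-up is over every request pulls in j2_maxPageReadAhead pages, so a long sequential read moves many pages per completed request.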

So the ratio of start I/Os to iodones is really an indicator of how much sequential I/O the system is doing. And, if the system has lots of sequential I/O, it can be an indicator used to tune JFS2 read-ahead.
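If you want to track that ratio over time, a rough Python sketch like the following can pull the two counters out of saved “vmstat -s” output. The sample text is the excerpt from the top of this post; the function names are made up for this example, and the parsing assumes each counter line looks like "<number> <label>".

```python
import re

# Excerpt of "vmstat -s" output taken from the top of this post
SAMPLE = """\
            344270755 start I/Os
             16755196 iodones
"""


def parse_counters(text):
    """Map each counter label to its integer value."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def start_to_done_ratio(text):
    """Ratio of 'start I/Os' to 'iodones' in vmstat -s output."""
    counters = parse_counters(text)
    return counters["start I/Os"] / counters["iodones"]


ratio = start_to_done_ratio(SAMPLE)
print(f"{ratio:.1f} start I/Os per iodone")
```

The counters are cumulative since boot, so comparing two samples taken some time apart says more about the current workload than a single reading does.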

And remember, the difference between minfree and maxfree should be at least equal to j2_maxPageReadAhead. So if you make changes, adjust minfree and maxfree accordingly.
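As a quick sanity check of that rule, here is a small Python sketch. The numbers are illustrative, not a recommendation, and the helper names are invented for this example; on a real system you would take the current values from the vmo and ioo tunables.

```python
def read_ahead_headroom_ok(minfree, maxfree, j2_max_page_read_ahead):
    """True if maxfree - minfree is at least j2_maxPageReadAhead."""
    return (maxfree - minfree) >= j2_max_page_read_ahead


def smallest_maxfree(minfree, j2_max_page_read_ahead):
    """Smallest maxfree that satisfies the rule for a given minfree."""
    return minfree + j2_max_page_read_ahead


# Illustrative values: a gap of 128 pages between minfree and maxfree
print(read_ahead_headroom_ok(960, 1088, 128))   # gap matches read-ahead
print(read_ahead_headroom_ok(960, 1088, 512))   # read-ahead raised, gap too small
print(smallest_maxfree(960, 512))               # maxfree needed after the change
```

This only checks the simple rule stated above; it does not account for anything else that might influence where minfree and maxfree should sit.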
