AWS Micro Instance WordPress Tuning

Recently I registered a new domain, rootuser.ninja, to test AWS using the 12-month free tier offer. That gives you 750 hours per month to run a micro instance with 1GB of RAM, an 8GB disk and 1 vCPU, basically for free for a year. What’s not to like about that?! After the 12 months, each micro instance is $0.012 per hour, which is cheaper than a lot of web hosting, and you get a server all to yourself.

After spinning up my little VM with a CentOS 7 image I installed:

  • Apache with all the WordPress prerequisites
  • MariaDB
  • Postfix for SMTP
  • Dovecot for IMAP
  • Amavis for Email scanning
  • SpamAssassin for spam scanning
  • ClamAV for Email antivirus
  • WordPress
  • Webmail

After configuring all the software, importing my WordPress database and installing my themes and plugins, this little free VM from Amazon is basically a web hosting service in a box. I get Email with antispam and antivirus, a web-based Email interface, IMAP and SMTP Email for my mobile devices and website hosting with WordPress; so this little 1GB server is doing a lot of work.

WordPress isn’t exactly a lightweight option for hosting web pages. If I wanted raw speed I would probably install something like Nginx serving static pages. But I wanted to put this little server through its paces, and WordPress is both very popular and an extremely standard configuration that anyone can support.

Next I went looking for benchmarking tools. I’ve used Siege before, but it’s labor-intensive to run a lot of scenarios and extract the data in a manageable format. What I found was LoadImpact.com, a basic web stress-testing service that generates good reports, lets you customize the scenario, and exports the data to CSV if you want to do custom reporting. They integrate with New Relic, so you can overlay your server statistics if you’re a New Relic customer. And, even better, there’s a free tier where you can spin up test scenarios to see how the service works.

This is basically the stock install with very little tuning done. Here are the results with 25 virtual users:

LoadImpact stock config 25VUs

As you can see, the wheels start to come off between 14 and 16 virtual users. Here’s vmstat output from the server:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free inact active si so bi bo in cs us sy id wa st
0 0 128724 77520 423532 423464 0 0 2 1 27 72 0 0 100 0 0
0 0 128724 77396 423532 423464 0 0 0 4 40 97 0 0 100 0 0
0 1 195804 60720 450960 412876 0 13416 23894 13423 1095 744 20 4 63 13 0
0 0 300532 62640 452568 394268 0 20946 42329 21000 1591 1239 23 6 46 25 0
0 0 300532 68908 435772 402976 0 0 2900 0 126 137 0 0 98 1 0
0 0 300532 156012 432580 323940 0 0 2718 0 497 479 20 3 77 1 0

What you see is the server going along completely idle until the virtual users start. Immediately there’s pressure on memory and the kernel starts swapping out to disk, the “so” column. At the same time there’s some CPU usage, the “us” column, and some stress on the I/O subsystem, the “wa” column, but not a lot of processes blocked on I/O, the “b” column. This is when the server starts spinning up httpd processes to handle the requests. Once those processes are started, it’s quiet again.
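If you want to capture this kind of output yourself, it’s just vmstat with active/inactive memory reporting, sampling every few seconds; something like:

vmstat -a 5        # -a shows inact/active memory; one sample every 5 seconds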

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free inact active si so bi bo in cs us sy id wa st
0 0 300532 191748 433244 290536 0 0 474 13 95 156 0 1 99 0 0
1 0 300532 95320 433596 390152 0 0 166 1 499 296 22 3 75 0 0
1 0 377184 62916 417864 407452 0 19766 23860 19785 1521 1241 38 7 46 9 0
0 4 467204 71552 409468 410024 6 13569 39147 13578 1257 785 11 5 63 20 0
0 0 473548 104864 410744 380584 0 1269 12700 1270 574 394 10 1 81 8 0
0 0 473548 109700 412584 379204 0 0 467 9 111 190 0 1 98 1 0
0 0 473548 105012 411360 389924 0 0 1788 18 565 530 20 3 76 1 0

Then, as a few more virtual users start up, there’s another hit, a little larger this time, followed by another quiet period.

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free inact active si so bi bo in cs us sy id wa st
1 0 473548 98528 411900 396088 0 0 942 0 481 299 19 2 78 0 0
0 0 496108 98340 420072 389376 6 4513 2642 4514 527 379 22 3 75 1 0
0 0 534600 84128 415984 400040 0 7698 2881 7711 971 509 40 3 53 4 0
0 52 601956 65720 406936 402948 491 21736 76078 21756 2012 870 6 27 41 26 0
0 10 750884 68616 405672 405504 436 24171 70228 24208 1949 1221 19 23 0 57 1
2 15 830092 74364 403444 403420 211 13290 64972 13294 1867 1024 20 7 0 72 1
6 4 918572 65196 408888 413156 952 18221 48684 18230 2651 1327 69 13 0 18 0
0 0 937712 69840 419668 366812 207 3874 22123 3885 1353 713 27 5 58 10 0
0 8 1045236 75040 391872 391388 458 21505 46228 21516 1573 935 16 7 49 28 0
6 3 1092932 59632 393496 410188 765 10330 60089 10331 1784 1112 24 7 0 68 0
1 2 992984 154652 350388 364660 14298 0 26733 11 1582 1501 31 4 51 14 0
1 4 971284 70608 389528 410608 4701 7465 19646 7475 1444 958 43 7 46 5 0
0 0 1002300 125788 382248 367688 1703 6318 54590 6332 1499 989 13 4 24 57 1
0 16 1069088 70420 403344 403080 3590 16073 59748 16212 2275 1373 31 12 3 53 1
0 38 1096192 70276 406416 407696 4713 14312 70625 14500 2055 1096 19 11 0 69 1
2 28 1155908 63692 419576 402056 5941 15580 66120 15596 2094 1472 13 9 0 78 1
0 25 1190732 76336 407212 407328 17626 27545 63867 27556 2726 2134 24 9 0 67 1
0 30 1201480 66744 414152 413876 17462 21536 64895 21564 3241 2028 39 8 0 52 1
0 26 1225468 69684 414936 410704 16846 16354 68164 16368 2887 1989 27 7 0 65 1
0 29 1201136 63584 420580 411332 11280 10859 67350 10869 2440 1660 11 17 0 71 1

Finally, there’s another big spike as more virtual users come online, but now the server is in trouble. You can see the free memory has dropped into five digits without really recovering, and the server has started swapping IN from disk, the “si” column. There’s also a lot of I/O going on, and the disk can’t keep up. The CPU is doing work, but it’s spending most of its time waiting on I/O rather than running flat out. At this point, for 25 virtual users, Apache had started over 80 processes!

This is a pretty typical memory-constrained system, which in turn stresses the disks because of the swapping; the server just can’t feed the CPU fast enough. The fix would be to add memory until the swapping is under control, then see if there’s still an issue with the disk subsystem. We’re not going to do that, because it would push us into the next server tier, so let’s start by limiting the Apache processes.

The web server will be mostly idle, but the server will need to process Email 24×7, so I’m sizing Apache to start a maximum of 20 processes. To limit the number of server processes Apache starts, I removed some modules I won’t be using and added these lines to the httpd.conf file:

StartServers 3
MinSpareServers 3
MaxSpareServers 4
ServerLimit 20

This is very small. It says we’re planning on the Apache server being mostly idle, starting just 3 processes and running a maximum of 20 server processes. We’ll pay a penalty when additional Apache processes have to start up, but we’ll keep more memory free for other processes when the web server isn’t busy.
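To sanity-check a limit like this, it helps to know how much resident memory each httpd process actually uses. A rough sketch, assuming the prefork MPM and the stock httpd process name on CentOS 7:

ps -o rss= -C httpd | awk '{sum+=$1; n++} END {printf "%d httpd procs, %.1f MB avg RSS, %.0f MB total\n", n, sum/n/1024, sum/1024}'

Multiply the average per-process size by the worker limit you’re considering, and leave room for MariaDB, Postfix, Amavis and friends, to see whether 20 processes really fits in 1GB.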

Here are the results of the second run:

LoadImpact ServerLimit 20 with 25VUs

 

This is a much more linear result, and the numbers are much better as the number of virtual users ramps up past about 16. Here’s what the server was doing:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free inact active si so bi bo in cs us sy id wa st
0 0 43656 74440 505044 350028 0 0 29 1 47 113 0 0 100 0 0
0 0 43656 74440 505048 350028 0 0 0 0 48 111 0 0 100 0 0
0 0 64808 76588 490660 361252 0 4230 6077 4231 290 334 2 1 90 7 0
0 0 64808 76464 490728 361256 0 0 6 6 52 122 0 0 100 0 0

Here’s a bump similar to what we saw with the first run, but a bit smaller. After this there’s a second wave as the processes spin up, very similar to the first run. But look at the numbers toward the end of the run:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free inact active si so bi bo in cs us sy id wa st
0 0 794712 104832 419764 388220 6 2910 3136 2923 816 551 38 3 56 2 0
0 26 849428 84784 416532 411036 18 10948 70797 10958 2143 781 24 45 16 14 0
0 0 848928 117556 435740 360032 82 0 5038 21 1236 748 62 4 31 4 0
1 0 848616 88356 445936 378980 77 0 2366 0 461 338 11 1 87 1 0
0 0 848608 103152 448652 361544 6 0 606 10 625 328 30 1 68 0 0
0 0 847308 102752 448372 362116 261 0 294 0 184 212 1 0 98 1 0
0 0 847300 100744 448896 363704 0 0 105 9 475 301 20 1 79 0 0
0 0 847296 100744 448896 363704 0 0 2 12 115 158 0 0 100 0 0
4 0 847288 77224 448896 387736 0 0 0 1 284 208 12 1 87 0 0
0 0 847264 117996 427260 367728 0 0 0 17 1118 477 66 4 30 0 0

Where before we had tons of swapping in and out and high I/O wait numbers, the server now looks relatively quiet. CPU usage is relatively low, disk I/O is low overall, and there’s a fair amount of free memory, even though we’re using 800MB of swap.

Now the memory workload is tuned to about the maximum the server can handle, and there’s no constraint on CPU or disk. Unfortunately, the response time goes over 20 seconds at about the 14-virtual-user mark, which is pretty slow.

Next we’ll look at how to improve the performance of WordPress.

Securing OpenSSH

I was recently researching the latest guidance on securing OpenSSH and came across a web page on a popular site claiming that the easiest way to protect OpenSSH is to define a login banner. While a login banner is useful, especially in an enterprise setting, it’s useless for securing SSH. So, here is my recipe for securing OpenSSH. While testing these changes, ALWAYS keep an existing connection open. It’s very easy to break something, and if you don’t already have an open connection you will have successfully locked yourself out.

  • Change the SSH port. I’m using 8022 in this example; you can use any port you like. This may not be practical in every setting, and it’s of marginal value because SSH will report what it is when you connect. But unless someone is doing an exhaustive scan of the open ports on your server, they probably won’t find your relocated SSH port. Assuming you have SELinux enabled, which I recommend, you must first allow SSH to use the target port number. First, review the current SELinux context for the port:
    # semanage port -l | grep 8022
    oa_system_port_t tcp 8022
    

    You can see this port is already labeled for oa-system. We aren’t using oa-system and nothing else is using port 8022, so we can modify the context to allow SSH to use it:

    semanage port -m -t ssh_port_t -p tcp 8022
    

    Now when we look, we see that both contexts are applied:

    semanage port -l | grep 8022
    oa_system_port_t tcp 8022
    ssh_port_t tcp 8022, 22
    

    Now you just need to tell SSH to use the above port by changing the Port line in /etc/ssh/sshd_config:

    Port 8022
    
  • Disable IPv6 if you’re not using it. Some people think this is pointless precisely because they’re not using IPv6 yet. My take is that if you’re not using IPv6, you’re probably not watching its security either. For instance, if you set up IPv4 firewall rules but ignore IPv6 and leave IPv6 addresses enabled on your server, an attacker may still be able to connect. If you’re not using it, just turn it off to lower the attack surface of your servers. In /etc/ssh/sshd_config, change the AddressFamily line:
    AddressFamily inet
  • Set the address SSH listens on. By default OpenSSH listens on every IP address. You may not want this if you have multiple IP addresses defined on your server; again, it reduces the attack surface. To do this, update the ListenAddress line with your IP address; you can specify multiple ListenAddress lines:
    ListenAddress 192.168.1.1
  • Set the SSH protocol to version 2 only. On any modern version of OpenSSH, this is the default, but I specify this anyway. Version 1 has been deprecated for years. Uncomment the Protocol line in /etc/ssh/sshd_config:
    Protocol 2
    
  • Disable weak host keys. You should disable any protocol version 1 or DSA keys. This leaves RSA, ECDSA and ED25519 keys enabled. To do this, review the HostKey lines in /etc/ssh/sshd_config and comment out ssh_host_key and ssh_host_dsa_key:
    #HostKey /etc/ssh/ssh_host_key
    #HostKey /etc/ssh/ssh_host_dsa_key

    While you’re at it, review the remaining HostKey directives. You can verify the key type and the number of bits in each key by running this command against them:

    ssh-keygen -lf FILENAME
  • Disable weak ciphers and MACs. Remove the older, weaker algorithms, but test this before rolling it out widely; I’ve found that some very old SSH and SCP/SFTP clients don’t support some of the newer ciphers. Update or add these lines in /etc/ssh/sshd_config:
    Ciphers aes256-ctr,aes192-ctr,aes128-ctr,arcfour256
    MACs hmac-sha2-256,hmac-sha2-256-etm@openssh.com,hmac-md5-etm@openssh.com,hmac-sha1-etm@openssh.com

    Note: these are valid for RHEL/CentOS 7; refer to the sshd_config man page for the list of ciphers valid for your specific version.

  • Validate logging. Check the SyslogFacility and LogLevel lines in /etc/ssh/sshd_config and verify that your syslog configuration actually captures messages for that facility and level.
  • Disable root logins. Don’t allow users to log in as root directly. Users should ideally log in with their own IDs and then run whatever they need to run as root with sudo. To disable root logins, uncomment or update the PermitRootLogin line:
    PermitRootLogin no
  • Set the maximum login attempts. By default users get 6 attempts to get their password right; your organization may set this lower. Once a user has used half their attempts, the remaining failures are logged. To change this, update the MaxAuthTries setting (MaxAuthTries 3, for example).
  • Set strict file permission checking; this is the default. Leaving StrictModes set to “yes” in /etc/ssh/sshd_config tells OpenSSH to only use key and configuration files that have restrictive permissions. This ensures that other users can’t modify or read a user’s SSH files.
  • Enable Public Key Authentication; this is the default. Leaving PubkeyAuthentication set to yes allows us to use public/private key pairs to authenticate. Key authentication with a passphrase on the private key is recommended; it’s analogous to two-factor authentication, since you must have the private key and know the passphrase.
  • Disable host based authentication. Host based authentication works like the old rhosts or hosts.equiv method: if the login would be permitted by $HOME/.rhosts, $HOME/.shosts, /etc/hosts.equiv, or /etc/shosts.equiv, and the server can verify the client’s host key (see /etc/ssh/ssh_known_hosts), then the login is permitted without a password. This is only slightly more secure than rsh, because it prevents IP spoofing and DNS spoofing, but it is still terribly insecure and is specifically disallowed by most organizations. To disallow this feature, set these options in /etc/ssh/sshd_config:
    RhostsRSAAuthentication no
    HostbasedAuthentication no
    IgnoreRhosts yes

    If you really need password-less authentication, most security policies will allow you to create a role or service account and setup user private key authentication.

  • Disallow empty passwords. If a user ID has a blank password, we don’t want that user ID to be able to log in. Set PermitEmptyPasswords in /etc/ssh/sshd_config:
    PermitEmptyPasswords no
  • Disallow password authentication. This may not be possible in all environments, but if it is, disallow password logins entirely and force users to use SSH keys. To do this, disable these options in /etc/ssh/sshd_config:
    PasswordAuthentication no
    ChallengeResponseAuthentication no
    
  • Set a pre-login banner. Yes, I set this option. It’s informational to the person connecting and required by a lot of corporate standards. The pre-login banner usually contains a warning that you’re connecting to a private system, that access is logged, and a warning not to log in if that’s a problem. Update /etc/issue.net with the text you want to display; by default on a lot of systems it shows the kernel version, giving an attacker a little more information. Once you’ve updated your /etc/issue.net file, update the Banner line in /etc/ssh/sshd_config:
    Banner /etc/issue.net

    You may also want to update /etc/motd. This file is displayed AFTER the user logs in. I usually put in a short banner that includes the hostname and whether the system is test or production. Something as simple as seeing the hostname right after login often keeps people from doing something they shouldn’t, and I usually include the hostname in the prompt as well. I can’t tell you the number of times someone has said “oh, I thought this was the TEST system” right after doing something stupid to their production server. This heads off a lot of those issues.

Once you have everything the way you want it, restart sshd and test:

service sshd restart
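Before and after that restart, a few extra checks are worth it. A rough sketch, assuming firewalld is in use and the port and address from the examples above; youruser is a placeholder:

sshd -t                                        # syntax-check /etc/ssh/sshd_config before restarting
firewall-cmd --permanent --add-port=8022/tcp   # allow the new SSH port (assumes firewalld)
firewall-cmd --reload
ssh -p 8022 youruser@192.168.1.1               # after the restart, test from a NEW session, keeping the old one open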

Poor Performance and Pending Tasks in Satellite 6.1

We recently installed a new Satellite 6.1 server on VMware to replace our older physical Satellite server. On our VMware engineer’s recommendation we configured the VM with 2 cores and 8GB of RAM, a bit under what Red Hat calls for. This is from the Red Hat Satellite 6.1 Installation Guide:

Red Hat Satellite requires a networked base system with the following minimum specifications:
64-bit architecture
The latest version of Red Hat Enterprise Linux 6 Server or 7 Server
A minimum of two CPU cores, but four CPU cores are recommended.
A minimum of 12 GB memory but ideally 16 GB of memory for each instance of Satellite. A minimum of 4 GB of swap space is recommended.

Looking at the system, it didn’t appear to be busy. But, tasks would sit in the Pending state and never complete. After a lot of work with Red Hat, we looked at the /etc/default/pulp_workers file:
# Configuration file for Pulp's Celery workers

# Define the number of worker nodes you wish to have here. This defaults to the number of processors
# that are detected on the system if left commented here.
PULP_CONCURRENCY=1

If PULP_CONCURRENCY were commented out, the number of worker processes would be set to the number of CPUs at startup, or 2 in our case. But with it set to 1, there weren’t enough worker processes to take the work off the queue. Once we changed PULP_CONCURRENCY to 4, the system load increased and tasks started moving. Red Hat wasn’t sure how this gets set at install time, but tuning the setting made a big difference.
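For reference, the change itself is small; roughly the following, with katello-service restarting everything (heavier than strictly necessary, but safe on Satellite 6.1):

# bump the Pulp worker count
sed -i 's/^#\?PULP_CONCURRENCY=.*/PULP_CONCURRENCY=4/' /etc/default/pulp_workers
# restart the Satellite services so the new worker count takes effect
katello-service restart
# confirm the celery workers came back up
ps -ef | grep '[c]elery worker' | wc -l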

We also increased the number of vCPUs to 4 and the RAM to 12GB, which dramatically improved performance. vCOps will tell your VMware administrator to cut back on resources because Satellite is idle almost all the time, but you need to size for the peak load times, when Satellite is synchronizing repositories, installing packages or running Puppet tasks.

Our server runs at almost 100% idle, with almost no load, and about 7.5GB of ram used. While running repository synchronization, the CPU utilization goes to 100%, with a run queue of 8 to 15, and about 8.3GB of RAM used.

Stor2RRD Overview

If you manage your own SAN, you’ll eventually be asked questions like “Why are some of my databases slow?”, “Why do we periodically have performance problems?” or “Do we have a hot LUN?”. Modern arrays have real-time performance monitoring, but not all of them keep historical data, so you can’t always tell whether there’s a periodic performance issue or whether the current performance is out of the ordinary. There are vendor-supplied products and lots of third-party products that gather performance statistics, but they’re usually pretty expensive. If you just need to gather and report on performance data for IBM V7000, SVC, or DS8000 storage, there is a great FREE product called Stor2RRD.

Stor2RRD is developed by XORUX, the developers of the excellent Lpar2RRD tool, and is free to use with relatively modest fees for support. As its name suggests, it collects data from your storage arrays and puts it into RRD databases. It has much the same requirements as Lpar2RRD, a simple Linux web server with Perl and RRDtool, and you can run it on the same server as Lpar2RRD. If you have a DS8000 array, you’ll also need the DSCLI package for your storage; for an SVC or V7000 array, SSH is all you need.

We had issues getting version 0.45 to work, but the developers responded to a quick Email with a preview of the next version, 0.48, which fixed the problem. The setup was pretty simple, we didn’t have any problems with the provided directions, and we got everything set up and tested in a couple of hours.

After running the tool for a couple of weeks, we’ve collected what seems like a lot of data. Some of the high-level graphs are very busy, so much so that they run the risk of being “data porn”, data for data’s sake that loses some of its usefulness. But you can drill down from these high-level graphs to the Storage Pool, MDisk, LUN, drive or SAN port level and get details like IOPS, throughput, latency and capacity.

For instance, here is a graph of the read performance for the managed disks in one of our V7000s:
mdisk_read

That sure looks like mdiskSSD3, the teal blue one, is a hot array. Here is the read response time for that particular mdisk:
mdiskSSD3_read_resp
The response time isn’t too bad on that array, 3ms Max and 1.4ms on average, which for this data is more than fast enough.

This is just one simple example of the data that Stor2RRD collects. With this data we have real information showing whether a system’s slowness is because the server is using an abnormal amount of bandwidth, or whether we should consider adding more SSD to an over-subscribed pool. That helps us make intelligent storage decisions and back up our reasoning with real numbers.

For the cost of a small Linux VM, you can deploy a troubleshooting and monitoring tool that rivals some very expensive third-party products. And, if it’s helpful in your environment, Stor2RRD annual support is a fraction of the cost of other products.

There is a full-featured demo on the Stor2RRD website where you can use the tool yourself with the developers’ data.

Linux LUN Resize

I recently had someone ask me how to resize a LUN in RHEL without rebooting. The “go-to” method for this admin was to reboot! This is easily accomplished in AIX with “chvg -g”, but how to do it in Linux wasn’t so obvious.

In my example, I’m using LUNs from a SAN-attached XIV storage array, with dm-multipath for multipathing and LVM for carving up the filesystems. After the LUN is resized on the storage array (96GB to 176GB in my case), we have to scan for changes on the SCSI bus. I’m assuming you have the sg3_utils package installed to get the scsi-rescan command. The simplest thing is to just rescan everything, though you can rescan devices individually if you want:

[root@mmc-tsm2 bin]# scsi-rescan --forcerescan                                                                                                   
Host adapter 0 (qla2xxx) found.
Host adapter 1 (qla2xxx) found.
Host adapter 2 (qla2xxx) found.
Host adapter 3 (qla2xxx) found.
Host adapter 4 (usb-storage) found.
Scanning SCSI subsystem for new devices
 and remove devices that have disappeared
Scanning host 0 for  all SCSI target IDs, all LUNs
Scanning for device 0 0 0 0 ...

This will run for a while as it scans all the LUNs attached to the system. Now let’s look at what multipathd thinks:

# multipath -ll dbvg5
dbvg5 (200173800049510dc) dm-7 IBM,2810XIV
size=96G features='1 queue_if_no_path' hwhandler='0' wp=rw

Multipathd now has to be updated with the correct information:

# multipathd -k"resize map dbvg5"
ok

And check it again:

# multipath -ll dbvg5
dbvg5 (200173800049510dc) dm-7 IBM,2810XIV
size=176G features='1 queue_if_no_path' hwhandler='0' wp=rw
...

Now let’s look at the PV:

# pvs /dev/mapper/dbvg5
  PV                VG            Fmt  Attr PSize   PFree
  /dev/mapper/dbvg5 tsminst1_dbvg lvm2 a--  96.00g  0

The LUN is resized and multipathd has the correct size, but the LVM PV is still the original size. I’m using whole-disk PVs; if you’re using partitions, you’ll also have to resize the partition with parted or a similar tool first. Now we just need to resize the PV:

# pvresize /dev/mapper/dbvg5
  Physical volume "/dev/mapper/dbvg5" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized

# pvs /dev/mapper/dbvg5
  PV                VG            Fmt  Attr PSize   PFree
  /dev/mapper/dbvg5 tsminst1_dbvg lvm2 a--  176.00g  80.00g

Now we can resize our LVs and run resize2fs on the filesystems to take advantage of the additional space.
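From here it’s the standard LVM workflow. A sketch using a hypothetical LV named dblv in the tsminst1_dbvg volume group shown above, assuming an ext4 filesystem:

lvextend -l +100%FREE /dev/tsminst1_dbvg/dblv   # grow the LV into the newly added extents
resize2fs /dev/tsminst1_dbvg/dblv               # grow the ext4 filesystem, online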

Is it Power7 or is it Power7+?


UPDATED

Last year I budgeted for 3 P740C models to replace 3 P6 550 models that were getting long in the tooth. Because of the long lead time in our budget process and IBM’s continued downward pressure on pricing, I was able to purchase 4 P7+ 740D models instead. That is a big win for us.

After implementing new 7042-CR7 model HMCs (which I recommend everyone upgrade to) and powering on our first box, I noticed that the latest HMC code reports the server as a Power7 and not a Power7+. The Power7+ chip has been out for nearly a year, and the HMC has been through several updates since then, so why does it not show Power7+ the way it did for Power6+? Here’s what the screen looks like:

HMC CPU Mode

So, what does the LPAR say when it’s powered on?  Everywhere I look, it’s Power7.  Here’s what the system thinks the CPU is:

nim # lsattr -El proc0
frequency   4228000000     Processor Speed       False
smt_enabled true           Processor SMT enabled False
smt_threads 4              Processor SMT threads False
state       enable         Processor state       False
type        PowerPC_POWER7 Processor type        False

And prtconf:

nim # prtconf 
System Model: IBM,8205-E6D
Machine Serial Number: 
Processor Type: PowerPC_POWER7
Processor Implementation Mode: POWER 7
Processor Version: PV_7_Compat

I do have a Power7 server running in Power6+ compatibility mode, here’s the output of prtconf on that server:

# prtconf
System Model: IBM,8202-E4B
Machine Serial Number: 10418BP
Processor Type: PowerPC_POWER7
Processor Implementation Mode: POWER 6
Processor Version: PV_6_Compat

So, maybe the OS commands aren’t aware of the CPU compatibility mode.  This is the latest firmware and the latest AIX 7.1 level.  I’m also running the latest HMC code, and I’ve confirmed the same behavior in the latest VIOS level (2.2.2.2).

Of course, the question was asked: did we really get what we paid for? So I called my IBM Business Partner and asked their technical sales team to dig into this. The box does have Power7+ processors, so it wasn’t mis-ordered and it WAS built correctly in the factory. They reached out to some other customers running a new P7+ 770, and confirmed the same behavior there, so I assume this is the same across the product line.

Then I had a bit of luck. As part of this upgrade, I’m testing AME on our non-production servers. The amepat tool shows the correct processor mode:

nim # amepat

Command Invoked                : amepat

Date/Time of invocation        : Fri Sep 27 11:53:38 EDT 2013
Total Monitored time           : NA
Total Samples Collected        : NA

System Configuration:
---------------------
Partition Name                 : nim
Processor Implementation Mode  : POWER7+ Mode
Number Of Logical CPUs         : 4
Processor Entitled Capacity    : 0.10
Processor Max. Capacity        : 1.00
True Memory                    : 4.00 GB
SMT Threads                    : 4
Shared Processor Mode          : Enabled-Uncapped
Active Memory Sharing          : Disabled
Active Memory Expansion        : Enabled
Target Expanded Memory Size    : 8.00 GB
Target Memory Expansion factor : 2.00

There we see the expected Power7+ mode.  This command works and reports the processor correctly on systems without AME enabled, so it can be used on any LPAR to show the correct processor type for Power7+ systems.  Here is the output on our Power7 LPAR running in Power6+ mode:

# amepat
Command Invoked : amepat
Date/Time of invocation : Wed Oct 2 12:41:43 EDT 2013
Total Monitored time : NA
Total Samples Collected : NA
System Configuration:
---------------------
Partition Name : tsm1
Processor Implementation Mode : POWER6

So, amepat doesn’t report Power6+ for Power7 systems running in Power6+ mode.

Our IBM client team is looking into this issue, and I expect the relevant commands will be enhanced in a future service pack and HMC level.  But, in the mean time, we can prove that what we ordered is what was delivered.

UPDATE :

IBM’s answer:

Historically IBM has not included the “+” on any of our products (ie Power 5+, Power6 or Power7+).  You can open a PMR and request a Design Change Request (DCR) to have the “+” added for Power7 servers.

That is an interesting answer to me.  We never purchased any Power6+ servers, so I can’t comment on what the OS commands, lsattr and the like, may or may not report. But, the HMC most definitely did report a separate compatibility mode for Power6+. My only thought is that the Power7+ CPU didn’t introduce a new operational mode, which is a little surprising to me because of some of the work done in this chip.

Privileges Necessary for MySQLDump

I recently set up a backup process to dump a MySQL database to a file for backup. With this database, our DBA group has been using the ‘root’ account set up by the software vendor for administration. This server is used for internal system administration and for sending performance data off to our software vendor, so other than it being bad form to use the ‘root’ ID, there’s probably no regulatory requirement to use user- or role-specific IDs.

That’s all well and good, but I’m not comfortable putting the ‘root’ ID password in scripts or backup products. And I need to ensure the mysqldump command runs and completes before the backup begins, so the natural thing to do is have the backup software run mysqldump as a pre-backup job with a dedicated MySQL user ID. While I’m at it, we really should give the backup user ID the minimum privileges necessary. So, first I create a user:

create user 'backup_user'@'localhost' identified by 'somepassword';

Now, what privileges does the backup user need? Here’s the list:

  • select: This is a given; without select we won’t get very far
  • show view: We need this if we want to back up views
  • trigger: If we have triggers to back up, we’ll need this
  • lock tables: Needed so mysqldump can lock the tables; not needed if using --single-transaction
  • reload: We need this if using --flush-logs
  • file: We would need this if we had mysqldump write the files itself, rather than redirecting the output to a file with ‘>’

So, we can grant these privileges on all the schemas, or just the schemas we want to back up:

grant select, show view, trigger, lock tables, reload, file on *.* to 'backup_user'@'localhost';
flush privileges;
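For what it’s worth, the pre-backup job then boils down to something like this; the --single-transaction and --flush-logs options and the output path are assumptions based on the privilege notes above:

# better still, keep the credentials in a ~/.my.cnf [client] section instead of on the command line
mysqldump -u backup_user -p'somepassword' \
  --single-transaction --flush-logs --triggers --all-databases \
  > /backup/mysql/all_databases.sql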

Sending AIX Syslog Data to Splunk

I recently put up a test Splunk server to act as a central syslog repository, addressing one of the findings in our security audits. There are some “open” projects that do this, but Splunk has a lot of features and is “pretty” compared to some of the open alternatives. Getting data from our Linux hosts was a snap, but getting data from our AIX hosts had a few minor annoyances. Fortunately, we were able to overcome them.

The syslogd shipped with AIX only supports UDP. rsyslog supports TCP, but hasn’t been ported to AIX. Another option is syslog-ng, for which there are open source and commercial versions compiled for AIX; but after installing all the dependent RPMs for the open source version, it would only segfault with no indication of the problem. So, to accept syslog via UDP, you have to enable a UDP data input on the Splunk server. That’s easily accomplished by going to Manager -> Data Inputs -> UDP -> New, entering 514 for the port, setting the source type to “From list” and choosing “syslog”. Check “More settings”, select DNS for “Set host” and click Save.
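If you prefer configuration files to the UI, the equivalent stanza in inputs.conf should look roughly like this (a sketch of the same settings, not copied from our server):

[udp://514]
sourcetype = syslog
connection_host = dns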

Once that is done, add a line to /etc/syslog.conf on the source node to send the data you want Splunk to record to the Splunk server. If your Splunk server is named “splunk”, it would look something like this:

*.info        @splunk

One of the problems with AIX’s implementation of syslog is its format. Here’s what Splunk records:

3/26/13 12:32:07.000 PM	Mar 26 12:32:07 HOSTNAME Mar 26 12:32:07 Message forwarded from HOSTNAME: sshd[21168310]: Accepted publickey for root from xxx.xxx.xxx.xxx port 39508 ssh2 host=HOSTNAME   sourcetype=syslog   source=udp:514   process=HOSTNAME

The AIX implementation of syslog by default adds “Message forwarded from HOSTNAME:”. That’s a little annoying to look at, but worse is that Splunk uses the hostname of the source as the process name, so you lose the ability to search on the process field. You can turn this off on the source with:

stopsrc -s syslogd
chssys -s syslogd -a "-n"
startsrc -s syslogd
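After restarting syslogd, it’s worth confirming that events are actually arriving in Splunk. A quick test from the AIX host; the facility and priority just need to match the selector in your /etc/syslog.conf:

logger -p user.info "splunk forwarding test from $(hostname)"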

TSM Deduplication Increases Storage Usage (for some values of deduplication)

I ran into an interesting problem recently. A de-duplicated pool containing TDP for Oracle backups was consuming much more space than would otherwise be indicated. Here’s what the occupancy looked like:

Node Name         Storage         Number of     Logical
                  Pool Name           Files     Space
                                                Occupied
                                                (MB)
----------        ----------    -----------    ----------- 
CERN_ORA_ADMIN    CERNERDISK            810      31,600.95 
CERN_ORA_BUILD    CERNERDISK          1,189      74,594.84 
CERN_ORA_CERT     CERNERDISK            402   3,876,363.50 
CERN_ORA_TEST     CERNERDISK            905   7,658,362.00
LAW_ORA_PROD      CERNERDISK          1,424     544,896.19 
OEM_ORA_RAM       CERNERDISK          2,186     524,795.31

That works out to about 12.7 TB. And, here’s what the storage usage looked like:

Storage         Device          Estimated       Pct       Pct     High  Low  
Pool Name       Class Name       Capacity      Util      Migr      Mig  Mig  
                                                                   Pct  Pct  
-----------     ----------     ----------     -----     -----     ----  ---  
CERNERDISK      CERNERDISK       47,319 G      90.4      90.4       97  90 

That’s about 47TB of storage at 90% utilization, which works out to just over 42TB of used storage. On top of that, TSM was reporting a deduplication “savings” of about 2TB, meaning the pool represents roughly 44TB of backed-up data. But only 12.7TB was actually backed up!

IBM has built a few interesting scripts lately to collect TSM data for support. One of them is tsm_dedup_stats.pl, a little Perl script that collects quite a bit of information relating to deduplication. Here’s some summary info from that script run a couple of days later:

Pool: CERNERDISK
  Type: PRIMARY		   Est. Cap. (MB): 48445474.5  Pct Util: 88.7
  Reclaim Thresh: 60	Reclaim Procs: 8		  Next Pool: FILEPOOL
  Identify Procs: 4	  Dedup Saved(MB): 2851277


  Logical stored (MB):	  9921898.18
  Dedup Not Stored (MB):  2851277.87
  Total Managed (MB):	 12773176.05

  Volume count:			        4713
  AVG volume size(MB):	        9646
  Number of chunks:	       847334486
  Avg chunk size:	           87388

There’s some interesting stuff in there. There’s almost 10TB of logical storage in the storage pool, almost 3TB saved in deduplication, and about 12TB total managed storage, which matches the output from “Q OCCupancy” pretty closely. The output also has a breakdown by client type and storage pool of the deduplication rate:

Client Node Information
-----------------------
    DP Oracle:		7
      Stats for Storage Pool:		CERNERDISK
        Dedup Pct:		22.28%
    TDPO:		1
      Stats for Storage Pool:		CERNERDISK
        Dedup Pct:		23.14%

So far so good, tsm_dedup_stats.pl matches what we’re seeing with the regular TSM administrative commands.

At this point, I ran “REPAIR OCC“. There’s a possible issue where the occupancy reported and the storage reported by “Q STG” can be inaccurate, and this new command validates and corrects the numbers reported. Unfortunately, running it had no effect on the problem.

The next thing we looked at was the running deduplication worker threads. After the “IDentify DUPlicates” command locates and marks “chunks” as duplicates, background processes run and actually remove the duplicated chunks. Running “SHOW DEDUPDELETE”, one of the undocumented show commands in TSM, reports the number of worker threads defined, the number of active threads, and which node and filesystem IDs are currently being worked on. If all the worker threads are active for a significant amount of time, more worker threads can be started by putting the “DEDUPDELETIONTHREADS” option in the dsmserv.opt file and restarting the server. The default is 8, on the bigger servers I’ve bumped that to 12. Bumping this number will generate more log and database traffic as well as drive more CPU usage, so you’ll want to keep an eye on that.
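For reference, the option is a single line in dsmserv.opt; we used 12 on the bigger servers:

DEDUPDELETIONTHREADS 12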

I only had 4 threads busy routinely, so adding more threads wouldn’t have helped. But those threads were always working on the same nodes and filespaces. The node IDs and node names can be pulled out of the database by running this as the instance owner:

db2 connect to tsmdb1
db2 set schema tsmdb1
db2 "select NODENAME, NODEID, PLATFORM from Nodes"

Those 4 node IDs mapped to 4 of the nodes with data in our problem storage pool. You can see how much work is queued up per node ID with this SQL:

db2 connect to tsmdb1
db2 set schema tsmdb1
db2 "select count(*) as \"chunks\", nodeid from tsmdb1.bf_queued_chunks group by nodeid for read only with ur"

What was happening is that clients were backing up data faster than the TSM server could remove the duplicate data. Part of the problem is probably that so much data is concentrated in one filespace on each TDP client, so only one deduplication worker thread can process each node’s data at any one time.

We could do client-side deduplication to take the load off the server, but we’ve found that with the big TDP backups it slows down the backup too much. So, with only about 2TB of savings on 12TB of backed-up data, we came to the conclusion that just turning off deduplication for this storage pool was probably our best bet. After turning off deduplication, it took about 10 days to work through the old duplicate chunks. Now the space used and the occupancy reported are practically identical.

Installing the XIVGui on Fedora 16

I’ve been running the XIVGui on a Windows 7 VM so that I have it available from anywhere. That works, but then I have to launch an rdesktop session, log in, launch the XIVGui, and log in again. I finally got tired of the extra steps and decided to load the XIVGui when I upgraded to Fedora 16. I considered making an RPM, but I’m sure IBM would frown on redistributing their code. These manual steps work great on Fedora 16 and should work fine on Fedora 15; I haven’t tested them with RHEL or other versions.

First, you need the 32bit version of libXtst, even if you’re using the 64bit client:

yum install libXtst-1.2.0-2.fc15.i686

Then just download the package from IBM’s FTP server, uncompress it, and move the resulting directory to someplace on your system; I used /usr/local/lib.

tar -zxvf xivgui-xxx-linux64.tar.gz
mv XIVGUI /usr/local/lib/

Then, we just need to make a couple of .desktop files.

/usr/share/applications/xivgui.desktop:

[Desktop Entry]
Name=XIVGui
Comment=GUI management tool for IBM XIV
Exec=/usr/local/lib/XIVGUI/xivgui
Icon=/usr/local/lib/XIVGUI/images/xivIconGreen-32.png
Terminal=false
Type=Application
Categories=System;
StartupNotify=true
X-Desktop-File-Install-Version=0.18

/usr/share/applications/xivtop.desktop:

[Desktop Entry]
Name=XIVTop
Comment=GUI performance tool for IBM XIV
Exec=/usr/local/lib/XIVGUI/xivtop
Icon=/usr/local/lib/XIVGUI/images/xivIconTop-32.png
Terminal=false
Type=Application
Categories=System;
StartupNotify=true
X-Desktop-File-Install-Version=0.18

Now XIVGui and XIVTop should show up under “System Tools”.
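If the launchers don’t show up right away, refreshing the desktop database may help; this assumes desktop-file-utils is installed, which it normally is on Fedora:

update-desktop-database /usr/share/applications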