Optimizing Filesystem Layouts in AIX to Increase Performance

################################################################################
# Optimizing Filesystem Layouts in AIX to Increase Performance
################################################################################

# Regular logical volume striping is no longer a best practice
# in AIX. The overhead is too high, the consequences of disk
# failure are too great, and it is frankly better done on your
# storage array than in your operating system. You can, however,
# do a form of “poor man’s striping” that can give you
# substantial performance gains by spreading I/O across the
# physical volumes in your volume groups. This procedure
# describes how to do this safely using INTER-POLICY, reorgvg
# and defragfs.
#
# Requirements: You’ll ideally need a version of AIX where the
# defragfs command supports the “-f” flag. You can proceed
# without it, but the flag gives the best results.
# Your volume groups must also be spread over multiple physical
# volumes; this procedure does no good for single-disk VGs.
# Never run this on rootvg for any reason.
# Ideally every disk in your VG should be the same size. If they
# are not, you’ll have problems growing filesystems later (a fix
# for this is provided in step III below). These scripts take a
# LONG TIME to run. Never run them in the foreground, because
# they can cause significant damage if interrupted. Depending on
# the age of your SAN and the workload on it, performance might
# suffer while steps I and II are actively running. The
# performance gain after the change is usually quite
# significant, however.
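
# A quick pre-flight check can confirm which volume groups meet
# the requirements above. The sketch below is not part of the
# original procedure: the function names are hypothetical, and
# it assumes "lsvg -p" prints a "<vg>:" line and one column
# header line before the physical volumes.

```shell
#!/bin/ksh
# Hypothetical helpers; names are illustrative only.

# Read VG names on stdin and drop rootvg (same filter the scripts use).
eligible_vgs() {
    grep -v rootvg
}

# Read "lsvg -p <vg>" output on stdin and print the number of PVs.
# Assumes the first two lines are the "<vg>:" line and the header.
pv_count() {
    awk 'NR > 2 && NF > 0 { n++ } END { print n + 0 }'
}

# Print only the varied-on VGs that span more than one PV (AIX only).
multi_pv_vgs() {
    lsvg -o | eligible_vgs | while read vg
    do
        if [ "$(lsvg -p "$vg" | pv_count)" -gt 1 ]; then
            echo "$vg"
        fi
    done
}
```

# On an AIX host, calling multi_pv_vgs after loading these
# functions should list the volume groups worth processing.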

# Procedure:

# I. This defrags every filesystem on your machine. Skip this
# step if your version of AIX does not support defragfs with the
# “-f” flag.
# Create a script in /tmp with the following contents
# (e.g. vi /tmp/defrag.sh; chmod 755 /tmp/defrag.sh )
#

#!/bin/ksh
# Defragment every filesystem in every varied-on volume group.
{
lsvg -o | while read a
do
    lsvgfs "$a" | while read b
    do
        echo "defrag of $a - $b:"
        timex defragfs -f -y "$b"
    done
done
echo "finished."
}

# Run it with the following:
cd /tmp
rm -f nohup.out
nohup ./defrag.sh &
tail -f nohup.out
# You’ll know it’s complete when you see “finished.” in the output
# at which point you can exit the tail -f with ctrl-C.
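
# If you would rather not watch the tail, you can poll the
# output file for the “finished.” marker instead. This is a
# minimal sketch, not part of the original procedure; the
# function name and the 60-second default interval are
# assumptions.

```shell
#!/bin/ksh
# wait_for_finished FILE [INTERVAL_SECONDS]
# Hypothetical helper: block until FILE contains "finished.".
wait_for_finished() {
    file=$1
    interval=${2:-60}
    until grep 'finished\.' "$file" >/dev/null 2>&1
    do
        sleep "$interval"
    done
    echo "$file reports finished."
}
```

# For example, “wait_for_finished /tmp/nohup.out 300” checks
# every five minutes and returns once the script has ended.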

# II. This sets a maximum INTER-POLICY and then runs reorgvg on
# the volume groups to get the best possible spread of I/O
# across the physical volumes.
# Once again: this does no good for volume groups on a single
# physical volume, and you must never do this to rootvg.
# Create a script in /tmp with the following contents
# (e.g. vi /tmp/reorg.sh; chmod 755 /tmp/reorg.sh )

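#!/bin/ksh
# Set every LV outside rootvg to maximum inter-policy, then
# reorganize its volume group.
{
lsvg -o | grep -v rootvg | while read b
do
    # Skip the two header lines of "lsvg -l"; keep the LV name column.
    lsvg -l "$b" | tail -n +3 | awk '{ print $1 }' | while read a
    do
        chlv -e x "$a"
    done
    echo "reorgvg of $b:"
    timex reorgvg "$b"
done
echo "finished."
}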
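
# Before running a script like the one above, it can help to
# preview which chlv commands it would issue. The sketch below
# only prints the commands; it changes nothing. The function
# names are hypothetical, and it assumes “lsvg -l” prints two
# header lines before the logical volumes.

```shell
#!/bin/ksh
# lv_names: read "lsvg -l <vg>" output on stdin, print the LV names.
lv_names() {
    awk 'NR > 2 && NF > 0 { print $1 }'
}

# chlv_preview POLICY: read "lsvg -l <vg>" output on stdin and print
# the chlv command for each LV without executing it.
chlv_preview() {
    policy=$1
    lv_names | while read lv
    do
        echo "chlv -e $policy $lv"
    done
}
```

# On AIX, “lsvg -l datavg | chlv_preview x” (datavg being an
# example name) would list the commands for that volume group.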

# Run it with the following:
cd /tmp
rm -f nohup.out
nohup ./reorg.sh &
tail -f nohup.out
# This will take a very long time, frequently on the order of 24
# hours or more!
# You’ll know it’s complete when you see “finished.” in the output
# at which point you can exit the tail -f with ctrl-C.
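
# Once the run has completed, you may want to confirm that the
# policy change took effect. This is a hedged sketch: it assumes
# “lslv <lv>” prints a field labelled “INTER-POLICY:” followed
# by its value, and the function names are hypothetical.

```shell
#!/bin/ksh
# inter_policy: read "lslv <lv>" output on stdin and print the
# value that follows the "INTER-POLICY:" label.
inter_policy() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "INTER-POLICY:") print $(i + 1) }'
}

# report_policies VG: print "<lv> <policy>" for every LV in VG (AIX only).
report_policies() {
    lsvg -l "$1" | awk 'NR > 2 && NF > 0 { print $1 }' | while read lv
    do
        echo "$lv $(lslv "$lv" | inter_policy)"
    done
}
```

# After step II, each LV listed by report_policies should show
# “maximum”; after step III it should show “minimum”.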

# III. (optional) If the disks in your volume groups are of
# different sizes, you’ll find that you can’t expand filesystems
# after part II above once the smallest disk becomes full. This
# is not a problem; you just need to turn off the maximum
# INTER-POLICY that was set above. You can turn it on and off
# without a reboot or any real impact as follows. (Note that this
# does not “undo” the work done above, but it will leave part of
# any newly resized LV not optimally striped. Repeating step II
# at a later time will fix this.)
# Create a script in /tmp with the following contents
# (e.g. vi /tmp/minpol.sh; chmod 755 /tmp/minpol.sh )

#!/bin/ksh
# Set every LV outside rootvg back to minimum inter-policy.
{
lsvg -o | grep -v rootvg | while read b
do
    lsvg -l "$b" | tail -n +3 | awk '{ print $1 }' | while read a
    do
        chlv -e m "$a"
    done
done
}
# Run it with the following:
cd /tmp
./minpol.sh

################################################################################

guvf-cebprqher-jnf-gnxra-sebz-orafubeg.pbz
rznvy-oybt@orafubeg.pbz-sbe-qrgnvyf

################################################################################

Using the IBM summ Command for Deeper Analysis of AIX errpt and Sense Data

################################################################################
# Using the IBM summ Command for Deeper Analysis of AIX errpt and Sense Data
################################################################################

################################################################################
# IBM unofficially offers a perl script called “summ” that will give you
# diagnostic information about I/O errors (mostly disk and disk adapters- but
# also some enhanced information about core dumps, dump devices, etc.) by
# analyzing the data and sense data in errpt. Below is some information about
# this tool.

# Requirements: You’ll need to download the summ script to an AIX machine. It
# doesn’t need to be the same AIX machine that you want to analyze.
# The machine where summ runs must have perl installed.

# As of this writing (late 2019), you can get a tar file from IBM containing
# the newest version of summ from this link:

https://www.ibm.com/support/pages/sites/default/files/inline-files/$FILE/summ_version_1.tar

# (Note: the link above does not work with curl or wget. You must use a browser.)

# It’s always best to check for newer versions from the main summ page:

https://www.ibm.com/support/pages/node/1072626

# And if that has moved, then you can find it again from the AIX Support Center
# Tools webpage, found here:

https://www-01.ibm.com/support/docview.wss?uid=aixtools_home

# Simply un-tar the file and pipe “errpt -a” into it to run:

tar xvf ./summ_version_1.tar
errpt -a | ./summ

# You can also bring over “errpt -a” output from another machine to analyze:

errpt -a > my.errpt.txt
# transfer my.errpt.txt to the machine and directory containing summ
cat my.errpt.txt | ./summ

# When you use summ to analyze disk errors, its output for each error usually
# ends with a failure code such as one of these:

EWRPROTECT
EFORMAT
..etc.

# If you are unfamiliar with these failure codes, you can look them up on any
# AIX machine in the file /usr/include/sys/errno.h for more information.
# e.g.:

$ grep EWRPROTECT /usr/include/sys/errno.h
#define EWRPROTECT 47 /* Write-protected media */
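
# If you look these codes up often, a small wrapper saves typing.
# This is a minimal sketch: the function name is hypothetical,
# and the optional second argument exists only so that a copy of
# the header from another machine can be searched instead.

```shell
#!/bin/ksh
# errno_lookup NAME [HEADER]
# Print the line(s) of HEADER that mention NAME as a whole word.
# HEADER defaults to the AIX errno.h path mentioned above.
errno_lookup() {
    name=$1
    header=${2:-/usr/include/sys/errno.h}
    grep -w "$name" "$header"
}
```

# For example, “errno_lookup EFORMAT” prints the matching
# definition line along with its comment.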

################################################################################
guvf-cebprqher-jnf-gnxra-sebz-orafubeg.pbz
rznvy-oybt@orafubeg.pbz-sbe-qrgnvyf
################################################################################