In line with my lack of a “proper” Monday post, I’m skipping the “proper” Wednesday post to bring you an AIX-specific tool I was asked to research for an internal project at work this week.
Many of you are likely familiar with Linux’s “inotify” facility, which can be used to trigger a response to a file change in real time. This is great for filesystem events. You might even be able to crowbar it into notifying on process death by monitoring something under /proc, if you’re brave enough to go poking around in there.
On AIX, there is no native “inotify.” There is, however, a special file system called the “Autonomic Health Advisor File System” (AHAFS). It is available by default and is easy to set up.
Our particular scenario was to detect when a specific process dies. We wanted a real-time notification and a log of the event. I spent a few hours reading up on how this works, then played with it until I was sure I understood the best way to use it, and this is what I came up with.
There is a sample Perl script, “aha.pl,” in the /usr/samples/ahafs/bin directory. It can be used out of the box for most things. In order to monitor a process, we need to know the full path of its executable. For this blog post (this is not the actual service I needed to monitor), we’ll say we want to monitor the Apache httpd service, located at /usr/local/apache/bin/httpd.
To monitor the death of this process, we need to mimic that filesystem structure underneath the appropriate “Event Factory” directory. In this case, that is /aha/cpu/processMon.monFactory/, which means the directory structure we need to end up with is /aha/cpu/processMon.monFactory/usr/local/apache/bin/.
Before we just go making the directory, though, we need to mount the ahafs filesystem to “/aha.” This is as simple as:
sudo mkdir /aha
sudo mount -v ahafs /aha /aha
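If the mount step ends up in a startup script, it may be worth guarding it so it only runs when AHAFS is not already mounted. The following is a minimal sketch, not a tested AIX procedure: the function names are made up for illustration, it assumes the output of "mount" shows the filesystem type as the standalone word "ahafs", and it uses "mkdir -p" (rather than the plain "mkdir" above) so that an existing /aha is not an error.

```shell
# Succeeds if the mount listing read from stdin contains an ahafs entry.
# Assumption: the listing shows the type as the standalone word "ahafs".
listing_has_ahafs() {
    grep -qw ahafs
}

# Mount AHAFS at /aha only when no ahafs entry is listed yet.
# The mount command is the one from the post; mkdir -p tolerates an
# already existing /aha directory.
ensure_aha_mounted() {
    if mount | listing_has_ahafs; then
        return 0
    fi
    sudo mkdir -p /aha && sudo mount -v ahafs /aha /aha
}

# Usage (hypothetical, e.g. near the top of a startup script):
#   ensure_aha_mounted || echo "could not mount /aha" >&2
```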
Once that’s done, we can build out the structure. The “cpu/processMon.monFactory” directory was created by the mount, but we still need to create the subdirectory structure beneath it ourselves.
sudo mkdir -p /aha/cpu/processMon.monFactory/usr/local/apache/bin
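If you need to do this for more than one process, the path mapping can be wrapped in a couple of small helpers. This is only a sketch: the function names and the AHA_FACTORY variable are hypothetical, and it assumes the monitor file is simply the executable's absolute path with ".mon" appended, placed under the factory directory.

```shell
# Factory directory that appears once ahafs is mounted at /aha.
AHA_FACTORY=${AHA_FACTORY:-/aha/cpu/processMon.monFactory}

# Print the monitor file path for an executable given by absolute path.
# Assumption: monitor file = factory dir + executable path + ".mon".
aha_mon_path() {
    case $1 in
        /*) printf '%s%s.mon\n' "$AHA_FACTORY" "$1" ;;
        *)  echo "aha_mon_path: need an absolute path: $1" >&2
            return 1 ;;
    esac
}

# Print the directory that has to exist before the monitor can be used.
aha_mon_dir() (
    mon=$(aha_mon_path "$1") || exit 1
    dirname "$mon"
)

# Usage (hypothetical):
#   sudo mkdir -p "$(aha_mon_dir /usr/local/apache/bin/httpd)"
```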
Now we can monitor at any time. The monitor itself will sit in “wait” status until the process being monitored dies, then it will print some event information. This means that we key off of this monitor process coming to completion before we do any notifications of our own. My recommendation is to tee the output of the event notification with “-a” to append, so that you have a log you can review, then have it email your support email account.
/usr/samples/ahafs/bin/aha.pl -m /aha/cpu/processMon.monFactory/usr/local/apache/bin/httpd.mon "CHANGED=YES;INFO_LVL=3" 2>&1 | tee -a /var/log/apache.monitor.log | mailx -s "Apache died on $( hostname )" firstname.lastname@example.org
This should probably be placed within a script, and that script should be called after starting apache. Of course, httpd is probably a terrible process to monitor this way, since it forks child processes all the time to handle requests, but it was an easy example for getting the point across.
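One way such a script could look is sketched below. It only wraps the pipeline shown earlier in a function so the monitor file, log file, and recipient become parameters. The function name, the AHA_PL and MAILER variables, and the apachectl path in the usage comment are assumptions for illustration, not part of the AHAFS samples.

```shell
# Paths to the sample monitor script and the mail client (overridable).
AHA_PL=${AHA_PL:-/usr/samples/ahafs/bin/aha.pl}
MAILER=${MAILER:-mailx}

# watch_death MON_FILE LOG_FILE RECIPIENT
# Blocks until aha.pl reports the event, then appends the event text to
# LOG_FILE and mails the same text to RECIPIENT.
watch_death() (
    # Use the monitor file's base name (minus .mon) in the subject line.
    name=$(basename "$1" .mon)
    "$AHA_PL" -m "$1" "CHANGED=YES;INFO_LVL=3" 2>&1 |
        tee -a "$2" |
        "$MAILER" -s "$name died on $(hostname)" "$3"
)

# Usage (hypothetical; the apachectl location is an assumption):
#   /usr/local/apache/bin/apachectl start
#   watch_death /aha/cpu/processMon.monFactory/usr/local/apache/bin/httpd.mon \
#       /var/log/apache.monitor.log firstname.lastname@example.org &
```

Running the function in the background keeps the startup script from blocking while the monitor waits.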
Also, it should be noted that for this specific scenario (a PID dies and we want a notification when it does), if the process in question is one that doesn’t daemonize on its own, you can probably get away with just doing this:
echo "/usr/local/bin/my_process" | at now
When the process dies, the at job “dies” and you get an email notification from the at job termination itself. You don’t get a local log appended, and you are relying solely on email for the notification, but it’s less work overall.
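If the missing local log is the main drawback, a small wrapper can add one back. This is a different technique from AHAFS, sketched here under assumptions: the function name and the script path in the usage comment are hypothetical, and it only works for a process that stays in the foreground. It runs the command, then appends a line with the exit status to a log file and prints the same line, so whatever output the at job mails you includes it.

```shell
# run_and_log LOG_FILE COMMAND [ARGS...]
# Runs COMMAND in the foreground; once it exits, appends a timestamped
# line with its exit status to LOG_FILE and echoes that line to stdout.
run_and_log() (
    log=$1
    shift
    status=0
    "$@" || status=$?
    printf '%s %s exited with status %s on %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$status" "$(hostname)" |
        tee -a "$log"
    exit "$status"
)

# Usage (hypothetical): save this in a script such as
# /usr/local/bin/run_and_log.sh that ends with: run_and_log "$@"
#   echo "/usr/local/bin/run_and_log.sh /var/log/my_process.log /usr/local/bin/my_process" | at now
```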