Cover V12, I08

Monitoring Sun Volume Manager

Andrew Kyle

Over the years, I've seen many servers that were built and then gradually neglected. Often their disks were mirrored using "Sun Volume Manager" (previously known as "Solstice DiskSuite"), and the configuration was set up once and then forgotten. These servers may have been configured for redundancy at one stage, but over time disks die, configurations change, and people make mistakes. For these reasons, metadevices must be actively monitored to make sure they are still doing what they were intended to do.

Here I provide a script for monitoring a volume manager setup and briefly describe how it works. Metadevice monitoring should go hand in hand with general monitoring of the server, which includes monitoring logs, filesystems, and performance.

The script provided in this article (Listing 1) is a simple but effective way to monitor the volume manager setup. It looks for metadevices that are not in the "okay" state and hot spares that are not in the "available" state. It also performs one other handy check: it looks for devices that have been configured as mirrors but have only one sub-mirror. This is an excellent way to tell whether you have forgotten to attach your mirrors.
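To make the three checks concrete, here is a minimal Python sketch. It is not Listing 1, only an illustration of the same idea. The SAMPLE text is a hand-written approximation of metastat output (the exact layout varies between releases), and the names find_problems and read_metastat are hypothetical helpers invented for this example.

```python
import re
import subprocess

# Hand-written approximation of metastat output (an assumption, not captured
# from a real host): a mirror with one submirror and a broken hot spare.
SAMPLE = """\
d0: Mirror
    Submirror 0: d10
      State: Okay
    Pass: 1

d10: Submirror of d0
    State: Okay
    Size: 4202688 blocks

hsp001: 2 hot spares
        c1t2d0s2        Available       2026080 blocks
        c1t3d0s2        Broken          2026080 blocks
"""


def read_metastat():
    """Run metastat and return its output (only works where metastat exists)."""
    result = subprocess.run(["metastat"], capture_output=True, text=True, check=True)
    return result.stdout


def find_problems(text):
    """Return a list of one-line problem descriptions for metastat-style text."""
    problems = []
    current = None    # metadevice or hot spare pool currently being read
    in_hsp = False    # True while reading the body of a hot spare pool
    submirrors = {}   # mirror name -> number of "Submirror N:" lines seen

    for line in text.splitlines():
        # Section headers start in column 0, e.g. "d0: Mirror".
        header = re.match(r"^(\S+): (.*)$", line)
        if header:
            current, desc = header.groups()
            in_hsp = "hot spare" in desc
            if desc.strip() == "Mirror":
                submirrors[current] = 0
            continue

        stripped = line.strip()
        if not stripped:
            continue

        if in_hsp:
            # Assumed layout: "<device> <status> <length> blocks".
            fields = stripped.split()
            if len(fields) >= 2 and fields[0] != "Device" and fields[1] != "Available":
                problems.append(f"{current}: hot spare {fields[0]} is {fields[1]}")
            continue

        if re.match(r"Submirror \d+:", stripped) and current in submirrors:
            submirrors[current] += 1

        state = re.match(r"State: (.*)$", stripped)
        if state and state.group(1) != "Okay":
            problems.append(f"{current}: state is {state.group(1)}")

    # A mirror with fewer than two submirrors provides no redundancy.
    for name, count in submirrors.items():
        if count < 2:
            problems.append(f"{name}: mirror has only {count} submirror")
    return problems
```

On a real host, find_problems(read_metastat()) would replace the sample text. The sketch reads only the "State:" lines and hot spare rows, so per-component states inside stripe tables are not examined.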

This script does not require any special privileges, as long as it can run metastat (which has default permissions of 755). This means the script can be run from any user's crontab for automation. I suggest running it as often as you feel is necessary for effective monitoring, leaving sufficient time to react to incidents. I run it every 10 minutes from a monitoring collection script that filters out any messages identical to ones already sent that day. This keeps my mailbox from being cluttered with duplicate messages.
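The duplicate filtering described above could work along the following lines. This is a sketch under assumptions, not the author's collection script: the function name filter_new, the one-file-per-day layout, and the state_dir argument are all hypothetical, and each message is assumed to fit on a single line.

```python
import datetime
import os


def filter_new(messages, state_dir, today=None):
    """Return the messages not yet recorded today, and record them.

    A file named after the date holds every message already sent that day,
    so a new day starts with an empty history automatically.
    """
    today = today or datetime.date.today()
    path = os.path.join(state_dir, f"sent-{today.isoformat()}.log")

    # Load the messages already sent today, if any.
    seen = set()
    if os.path.exists(path):
        with open(path) as fh:
            seen = {line.rstrip("\n") for line in fh}

    # Keep only messages that have not been seen, preserving their order.
    fresh = []
    for message in messages:
        if message not in seen:
            fresh.append(message)
            seen.add(message)

    # Remember what is about to be sent so the next run can skip it.
    if fresh:
        with open(path, "a") as fh:
            for message in fresh:
                fh.write(message + "\n")
    return fresh
```

A run every 10 minutes would pass each batch of alerts through filter_new and mail only what it returns. Old daily files are never removed by this sketch and would need separate cleanup.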

As a further note, you must ensure that sendmail (or an equivalent MTA) is running on your server. Sendmail should always be running in queue mode (-q), and in listening mode (-bd) only if absolutely necessary. Mail may fail to go out immediately because of network connection problems, server load, and so on, in which case it is simply queued on the server. However, there is no point in queuing the mail if you aren't going to receive it within an appropriate time. Running the sendmail daemon in queue mode lets the queue be reprocessed at a configured interval, so queued mail is sent out as soon as possible.

Conclusion

Monitoring is a major part of a systems administrator's job. It is important that all aspects of a server that affect its performance, reliability, and availability are monitored. This simple script will effectively catch any problems with the server's metadevices so prompt action can be taken to maintain redundancy and reliability.

Andrew received a Bachelor of Technology in Computer System Engineering from Massey University, NZ in 1994. Since then, he's done UNIX systems administration in Brisbane for Queensland Police and CITEC. He has concentrated on Solaris during the past 5 years, contracting in London, mainly for a securities bank, and now works for CSC in Sydney, Australia.