Showing posts with label mailx. Show all posts

Saturday, May 18, 2013

Mailing alerts for Administrator

As Middleware Admins, we might need to check whether the mail service is working on the Linux or Solaris machine where our WebLogic servers run. To monitor the servers and send alerts for issues, the mail service must be enabled. How do we check that the mail service is up on Linux? We tested sending mail from the Linux machine with mailx, using the following commands:
echo "This is the body."| mailx -s "mailx Test1" bhavanishekhar@gmail.com
echo "test" | mailx -s "test_sub" bhavanishekhar@gmail.com
When I sent the above test message, I got the following error: postdrop: warning: unable to look up public/pickup: No such file or directory. This suggests that no mail service is running. You can validate that in several ways. One option is to check the process list for the pattern 'mail' (no output means no matching process):
$  ps aux | grep mail |grep -v grep
$
Another option: the mail service on a Unix machine usually listens on port 25, so grep for it in the network status.
$  netstat -nl|grep 25
If the service appears to be running but mail still does not work, how do we debug it? My next choice is checking the service itself (svcs is the Solaris service listing command).
$ svcs | grep -i sendmail
$ ps -ef |grep -i sendmail
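
A plain grep for 25 also matches ports such as 2525 and any address that contains 25. Here is a minimal sketch that checks for a listener on exactly one port. The function name port_listening is my own (not a standard tool), and it assumes the local address in the netstat output ends in :PORT (Linux style) or .PORT (Solaris style).

```shell
#!/bin/bash
# port_listening PORT
# Reads "netstat -nl" or "netstat -an" output on stdin and succeeds (exit 0)
# only if some LISTEN line has an address ending in :PORT or .PORT.
port_listening() {
    awk -v port="$1" '
        /LISTEN/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ ("[.:]" port "$")) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Usage: netstat -nl | port_listening 25 && echo "something is listening on port 25"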

Thanks for these valuable inputs from my mate Mangaleshwaran.

Keep posting your experiences with mailx or sendmail.

cheers!

Tuesday, June 29, 2010

Monitoring CPU Load Averages with Shell Script

Today I started re-inventing myself and looked back at what I had done to help my team and me work better. I remembered how we used to open UNIX SSH windows to monitor the CPU load average at each site. While we were monitoring site1, some other site could become overloaded without anyone noticing, which made the work hard to control. It was funny, my dear buddy named it 'Barbar work!!' :)

After a little R&D on Google, I found a few suitable solutions. I chose to run the 'uptime' command over a remote SSH connection in a loop. To add more value, the script sends a mail when the load crosses a threshold. This threshold varies with the application and the CPU power; trial and error will help you identify the right value. Venu GopalRao helped a lot in defining these thresholds. Once the script started working, he was amazed and appreciative.
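
The threshold idea can be sketched as a small function. The values 14 and 19 are the ones used in the script further down; the names WARN_LOAD, CRIT_LOAD and load_level are only illustrative, so adjust them to your own servers. Note that the critical check has to come first, otherwise the warning branch catches every high value.

```shell
#!/bin/bash
# Thresholds can be overridden from the environment; defaults follow the script below.
WARN_LOAD=${WARN_LOAD:-14}
CRIT_LOAD=${CRIT_LOAD:-19}

# load_level LOAD
# LOAD is the integer part of the 1-minute load average.
# Prints OK, WARN or CRITICAL.
load_level() {
    if [ "$1" -gt "$CRIT_LOAD" ]; then
        echo CRITICAL
    elif [ "$1" -gt "$WARN_LOAD" ]; then
        echo WARN
    else
        echo OK
    fi
}
```

Usage: load_level 16 prints WARN with the default thresholds.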

This script can run indefinitely at a specified time interval. You can also use the 'at' command or 'crontab' for this task. I prepared a 'bash' script that works on both Solaris and Linux.

Before running this script, we need to set up password-less connections to all the remote machines with the 'ssh-keygen' command. Public key authentication is a good choice for connecting to remote UNIX machines without a password. You can use any supported key algorithm, such as RSA or DSA.
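
A minimal sketch of the key setup, assuming OpenSSH is installed. The function names make_key and push_key are hypothetical, and ~/.ssh/id_rsa is only an assumed default path. ssh-copy-id may not exist on older Solaris releases; in that case, append the .pub file to ~/.ssh/authorized_keys on each server by hand.

```shell
#!/bin/bash
# make_key [KEYFILE]
# Create an RSA key pair with an empty passphrase unless KEYFILE already exists.
make_key() {
    local keyfile="${1:-$HOME/.ssh/id_rsa}"
    [ -f "$keyfile" ] && return 0
    ssh-keygen -q -t rsa -b 4096 -N "" -f "$keyfile"
}

# push_key SERVER_LIST [KEYFILE]
# Install the public key on every host listed in SERVER_LIST (one host per line).
# Each host asks for the password one last time.
push_key() {
    local server
    for server in $(cat "$1"); do
        ssh-copy-id -i "${2:-$HOME/.ssh/id_rsa}.pub" "$server"
    done
}
```

Usage: make_key && push_key prodServers.txt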


Customization/Cosmetics to this script
When you run this script at your prompt, the servers with a high load average are shown in red, which helps you act on them quicker. I kept the server list in a plain text file and read it entry by entry for looping.

#!/bin/bash
#======================================================
# This script will check CPU Load, network ping status
# and also checks diskspace on every machine
#======================================================
RECIPIENTS="pavanwla@yahoo.co.in"
LOG=./load.log
 
check_load()
{
        loadnow=`echo $msg| cut -d, -f4 | cut -d: -f2 | cut -d. -f1`
        d=`echo $msg |awk '{print $((NF-1))}'`
        SD=`date "+%Y-%h-%d@%H:%M:%S"`
        echo $SD '****'
        # Check the higher threshold first; otherwise the -gt 14 branch catches every high load
        if [ $loadnow -gt 19 ]; then
                echo -e ' \033[31m' $server ' ' $loadnow '\033[m'>>$LOG
                echo $SD $server ' ' $loadnow |mailx -s LOAD_CRITICAL $RECIPIENTS
        elif [ $loadnow -gt 14 ]; then
                echo -e ' \033[31m' $server ' ' $loadnow '\033[m'>>$LOG
                echo $SD $server ' ' $loadnow |mailx -s LOAD_WARN $RECIPIENTS
        else
                echo -e $server '\t' $loadnow '\t' $p '\t'$d >>$LOG
        fi
}

#==============================================================
#                 M A I N  S C R I P T
#==============================================================
if [ -f $LOG ]
then
        rm $LOG
fi
serlist=`cat prodServers.txt`
echo -e "========================================================">>$LOG
echo -e "  HOSTNAME  CPU Load     Network status       Disk Space">>$LOG
echo -e "========================================================">>$LOG
 
for server in $serlist
do
        echo 'connecting to ' $server
        msg=`ssh $server "uptime; df -k /app|grep app |awk '{print \$5}'"`
        p=`ping -s $server 56 2 |grep loss | awk -F',' '{ print $3 }'`
        check_load
done
cat $LOG
Please make sure the prodServers.txt file is in the same path as the script. A sample prodServers.txt file looks like this:
myprod.server1.com
myprod.server2.com
...
myprod.server20.com
Upgrade Script
To add more flavor to the load average script, it also finds the disk space on every machine and verifies network connectivity by checking the ping response from every machine. Initially, I wrote it with two ssh commands: one to find the load average on each remote machine, and another to check the disk space. That is not good scripting practice. With the help of the LinkedIn discussion, I updated it to use a single ssh command, so it runs faster by opening fewer ssh sessions.
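
Once load and disk space arrive in one ssh reply, the reply has to be split again. Here is one possible sketch with awk; it is not the parsing used in the script above. The function name parse_status is my own, and it assumes the reply holds one uptime line followed by one df -k line whose second-to-last field is the capacity.

```shell
#!/bin/bash
# parse_status
# Reads the combined "uptime; df -k /app | grep app" output on stdin and
# prints "<integer part of the 1-minute load> <disk capacity>".
parse_status() {
    awk '
        /load average/ {
            # Drop everything up to "load average:" and keep the first value.
            sub(/.*load averages?: */, "")
            split($0, loads, /[ ,]+/)
            load = int(loads[1])
            next
        }
        NF >= 2 { disk = $(NF - 1) }   # df line: capacity is the second-to-last field
        END { print load, disk }
    '
}
```

Usage: ssh "$server" "uptime; df -k /app | grep app" | parse_status

Anchoring on the text "load average:" avoids counting commas, which differ when a machine has been up for less than a day.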

What is the next step?
If the CPU load average goes above the threshold, be ready to act. Log in to the affected UNIX machine and find the process causing the load with the 'top' command or 'prstat -L -a', depending on the UNIX environment.

Take thread dumps of the culprit Java process of the WebLogic instance. If the CPU load stays above the threshold, terminate that process/instance.
Analyze why the CPU load went high at that time and what the threads were doing.
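
The first of these steps can be scripted as well. A minimal sketch, assuming ps accepts the POSIX -o option (it does on both Linux and Solaris). The function name busiest_pid is hypothetical. Sending signal 3 (SIGQUIT) to a HotSpot JVM makes it write a thread dump to its standard output, which for WebLogic usually ends up in the server's stdout log.

```shell
#!/bin/bash
# busiest_pid
# Reads "ps -eo pid,pcpu,comm" output (header line included) on stdin and
# prints the PID with the highest %CPU; prints nothing if every value is 0.
busiest_pid() {
    awk 'NR > 1 && $2 + 0 > max { max = $2 + 0; pid = $1 }
         END { if (pid != "") print pid }'
}
```

Usage: pid=$(ps -eo pid,pcpu,comm | busiest_pid) and then, after confirming it is the WebLogic java process, kill -3 "$pid" to request the thread dump.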

Note: This script was created and executed on Solaris, from which it connects remotely to Linux and Solaris machines.

Good Forum Reference:
1. Linkedin Discussion
2. Shell Script for Load monitoring! http://www.daniweb.com/forums/thread48764.html

Comments are most welcome!! HAPPY TO HELP!!

Blurb about this blog


Essential Middleware Administration takes an in-depth look at the fundamental relationship between Middleware and operating environments such as Solaris, Linux, or HP-UX. This blog is meant for Middleware team members, developers, and architects, whether beginner or experienced. The automation scripts here are takeaways: they are generalized, so they are ready to use. Most of the scripts have been tried out in production environments.
Do you have ideas to contribute for Middleware Admins? Mail me at wlatechtrainer@gmail.com