Contents
  1. I. Usage:
  2. II. Shell script contents
  3. III. Notes
  4. IV. Changelog:

Linux Process Auto-Monitoring Shell Script V2

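This script monitors the load of a VPS server and the memory and CPU usage of its web programs. When the system load or a program's memory usage reaches the preset threshold, the script restarts that program; when a single php-fpm or httpd process uses too much CPU, the script kills that process directly. The goal is to reduce unexpected downtime caused by the server running out of resources.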
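The load check described above can be sketched as follows. This is a minimal illustration, not the script's own code: the helper names `load_1min` and `exceeds` and the sample `uptime` line are made up for the example, and it assumes `uptime` prints the three load averages as its last three fields.

```shell
#!/bin/bash
# Extract the 1-minute load average from one line of `uptime` output.
# The line is passed as an argument so the parsing can be tried on sample text.
load_1min() {
    # The last three fields are the 1-, 5- and 15-minute averages;
    # the first of them carries a trailing comma, which is stripped.
    echo "$1" | awk '{ gsub(/,/, "", $(NF-2)); print $(NF-2) }'
}

# Succeed (exit status 0) when VALUE is strictly greater than LIMIT.
# awk does the comparison because [ ] only compares integers.
exceeds() {
    awk -v value="$1" -v limit="$2" 'BEGIN { exit !(value + 0 > limit + 0) }'
}

# Hypothetical sample line; on a live system use: load_1min "$(uptime)"
sample=" 10:00:00 up 1 day,  2:03,  1 user,  load average: 6.42, 3.30, 1.25"
load=$(load_1min "$sample")
if exceeds "$load" 6; then
    echo "load $load is above the limit"
fi
```

Returning an exit status from `exceeds` lets the comparison sit directly in an `if`, without capturing a 0/1 string first.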

I. Usage:

git clone https://gist.github.com/1216837.git gist-1216837
vim gist-1216837/sys-mon.sh   # adjust the memory, CPU, and other preset thresholds
mkdir -p /var/script
mv gist-1216837/sys-mon.sh /var/script

Schedule the script to run once a minute:

crontab -e
* * * * * /bin/bash /var/script/sys-mon.sh
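Because the script sleeps between its stop and start attempts, a run that restarts several services may still be going when cron starts the next one. One way to avoid overlapping runs is a simple lock. The sketch below is an assumption and not part of the original script: the function name `run_once` and the lock path in the usage note are hypothetical, and it relies on `mkdir` failing when the directory already exists.

```shell
#!/bin/bash
# run_once LOCK_DIR COMMAND [ARGS...]
# Run COMMAND only if LOCK_DIR can be created; mkdir either creates the
# directory or fails, so two runs cannot both acquire the lock.
run_once() {
    local lock_dir=$1; shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "another run holds $lock_dir, skipping" >&2
        return 75
    fi
    "$@"
    local status=$?
    # Release the lock. If the run is killed before this line, the stale
    # directory has to be removed by hand.
    rmdir "$lock_dir"
    return "$status"
}
```

A wrapper script could source this function and call, for example, `run_once /tmp/sys-mon.lock /bin/bash /var/script/sys-mon.sh`, with the cron entry pointing at the wrapper instead.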

II. Shell script contents

It is best to open the URL below to see the latest version.

https://gist.github.com/1216837

#! /bin/bash
#====================================================================
# sys-mon.sh
#
# Copyright (c) 2011, WangYan <webmaster@wangyan.org>
# All rights reserved.
# Distributed under the GNU General Public License, version 3.0.
#
# Monitor system mem and load, if too high, restart some service.
#
# See: http://blog.wangyan.org/sys-mon-shell-script.html
#
# V 0.5, Date: 2011-12-08
#====================================================================

# Names of the services to monitor
# Each must have an init script in /etc/init.d
NAME_LIST="httpd nginx mysql"

# Maximum CPU usage allowed for a single process (%)
PID_CPU_MAX="25"

# Maximum combined memory usage allowed for one service (%)
PID_MEM_SUM_MAX="95"

# The maximum allowed system load
SYS_LOAD_MAX="6"

# Log path settings
LOG_PATH="/var/log/sys-mon.log"

# Date time format setting
DATA_TIME=$(date +"%y-%m-%d %H:%M:%S")

# Your email address
EMAIL="webmaster@example.com"

# Your website url
MY_URL="http://106.187.38.210/p.php"

#====================================================================

for NAME in $NAME_LIST
do
    PID_CPU_SUM="0";PID_MEM_SUM="0"
    PID_LIST=`ps aux | grep $NAME | grep -v root`

    IFS_TMP="$IFS";IFS=$'\n'
    for PID in $PID_LIST
    do
        PID_NUM=`echo $PID | awk '{print $2}'`
        PID_CPU=`echo $PID | awk '{print $3}'`
        PID_MEM=`echo $PID | awk '{print $4}'`
#       echo "$NAME: PID_NUM($PID_NUM) PID_CPU($PID_CPU) PID_MEM($PID_MEM)"

        PID_CPU_SUM=`echo "$PID_CPU_SUM + $PID_CPU" | bc`
        PID_MEM_SUM=`echo "$PID_MEM_SUM + $PID_MEM" | bc`

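        if [ `echo "$PID_CPU >= $PID_CPU_MAX" | bc` -eq 1 ];then
            if [[ "$NAME" = "php-fpm" || "$NAME" = "httpd" ]];then
                sleep 5
                # Re-sample the CPU usage; the result is empty if the process has already exited
                PID_CPU=`ps -o %cpu= -p $PID_NUM | tr -d ' '`
                if [ -n "$PID_CPU" ] && [ `echo "$PID_CPU >= $PID_CPU_MAX" | bc` -eq 1 ];then
                    # Log success only if kill actually delivered the signal
                    kill $PID_NUM && echo "${DATA_TIME}: kill ${NAME}($PID_NUM) successful (CPU:$PID_CPU)" | tee -a $LOG_PATH
                fi
            else
                echo "${DATA_TIME}: [WARNING!] ${NAME}($PID_NUM) cpu usage is too high! (CPU:$PID_CPU)" | tee -a $LOG_PATH
            fi
        fi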
    done
    IFS="$IFS_TMP"

    SYS_LOAD=`uptime | awk '{print $(NF-2)}' | sed 's/,//'`
    SYS_MON="CPU:$PID_CPU_SUM MEM:$PID_MEM_SUM LOAD:$SYS_LOAD"
#   echo -e "$NAME: $SYS_MON\n"

    SYS_LOAD_TOO_HIGH=`awk 'BEGIN{print('$SYS_LOAD'>'$SYS_LOAD_MAX')}'`
    PID_MEM_SUM_TOO_HIGH=`awk 'BEGIN{print('$PID_MEM_SUM'>'$PID_MEM_SUM_MAX')}'`

    if [[ "$SYS_LOAD_TOO_HIGH" = "1" || "$PID_MEM_SUM_TOO_HIGH" = "1" ]];then
        /etc/init.d/$NAME stop
        sleep 5
        for ((i=1;i<4;i++))
        do
            if [ `pgrep $NAME | wc -l` = "0" ];then
                echo "$DATA_TIME: Stop $NAME successful! ($SYS_MON)" | tee -a $LOG_PATH
                break
            else
                echo "${DATA_TIME}: [WARNING!] Stop $NAME failed[$i]! ($SYS_MON)" | tee -a $LOG_PATH
                pkill $NAME && killall $NAME
            fi
        done
        /etc/init.d/$NAME start
        sleep 5
        for ((ii=1;ii<4;ii++))
        do
            if [ `pgrep $NAME | wc -l` != "0" ];then
                echo "$DATA_TIME: Start $NAME successful!" | tee -a $LOG_PATH
                break
            else
                echo "${DATA_TIME}: [WARNING!] Start $NAME failed[$ii]! ($SYS_MON)" | tee -a $LOG_PATH
                /etc/init.d/$NAME start
                sleep 5
            fi
        done
        if [ `pgrep $NAME | wc -l` = "0" ];then
            echo "${DATA_TIME}: [ERROR!] Start $NAME failed! ($SYS_MON)" | mail -s "Start $NAME failed" $EMAIL
        fi
    fi
done

STATUS_CODE=`curl -o /dev/null -s -w %{http_code} $MY_URL`
#echo -e "STATUS CODE: $STATUS_CODE\n"

if [ "$STATUS_CODE" != "200" ];then
    sleep 3
    STATUS_CODE=`curl -o /dev/null -s -w %{http_code} $MY_URL`
    if [ "$STATUS_CODE" != "200" ];then
        echo "${DATA_TIME}: [WARNING!] Website Downtime! ($SYS_MON)" | tee -a $LOG_PATH
        echo "${DATA_TIME}: [WARNING!] Website Downtime! ($SYS_MON)" | mail -s "Website Downtime" $EMAIL
    fi
fi

The script is not hard to follow. For an explanation of how it works, see the earlier post, Linux Process Auto-Monitoring Shell Script.
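The script adds up the per-process CPU and memory figures with one `bc` call per value. As an illustration of the same aggregation, a single `awk` pass over `ps aux`-style lines gives both totals. This is a sketch, not the author's code: the function name `sum_cpu_mem` and the sample process lines are made up, and it assumes %CPU and %MEM are the third and fourth fields, as in `ps aux` output.

```shell
#!/bin/bash
# Sum the %CPU and %MEM columns (fields 3 and 4 of `ps aux` output)
# read from standard input, and print "CPU_SUM MEM_SUM".
sum_cpu_mem() {
    awk '{ cpu += $3; mem += $4 } END { printf "%.1f %.1f\n", cpu, mem }'
}

# Hypothetical sample in `ps aux` layout:
# USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
sample="www 101 12.5 3.0 0 0 ? S 10:00 0:01 nginx: worker process
www 102 30.0 4.5 0 0 ? S 10:00 0:05 nginx: worker process"

# Split the two totals into separate variables.
read -r cpu_sum mem_sum <<EOF
$(echo "$sample" | sum_cpu_mem)
EOF
echo "CPU:$cpu_sum MEM:$mem_sum"
```

With the sample lines above this prints `CPU:42.5 MEM:7.5`. On a live system the input would come from the same `ps aux | grep` pipeline the script uses.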

III. Notes

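  • NAME_LIST: every program listed must have a script in the /etc/init.d directory that supports the stop and start actions.
  • PID_CPU_MAX: the CPU usage limit for a single process. Only php-fpm and httpd processes are killed when they exceed it; for other programs the script just logs a warning.
  • PID_MEM_SUM_MAX: the limit on the combined memory usage of all of that program's processes, not on total system memory usage.
  • EMAIL: you receive an email alert only when a program fails to start or when the check of MY_URL does not return HTTP 200.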
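The first note can be checked before the script is scheduled. The sketch below lists the names in NAME_LIST that have no executable script in the init directory. The function name `missing_init_scripts` is hypothetical, and the check only confirms that a script exists, not that it supports stop and start.

```shell
#!/bin/bash
# Print each service name that has no executable script in INIT_DIR.
# Usage: missing_init_scripts INIT_DIR NAME...
missing_init_scripts() {
    local init_dir=$1; shift
    local name
    for name in "$@"; do
        [ -x "$init_dir/$name" ] || echo "$name"
    done
}

NAME_LIST="httpd nginx mysql"
# $NAME_LIST is left unquoted on purpose so that it splits into separate names.
missing=$(missing_init_scripts /etc/init.d $NAME_LIST)
if [ -n "$missing" ]; then
    echo "no init script for:" $missing >&2
fi
```

Running this once after editing NAME_LIST shows which entries the monitor would fail to restart.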

IV. Changelog:

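2011.11.28: Removed the nginx 502 status check, improved per-process CPU monitoring, and fixed inaccurate figures and other issues.
2011.12.07: Further fixes for incorrect CPU monitoring; added email notification after downtime.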
