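To monitor replication lag between a MySQL master and its slaves, we can use pt-heartbeat from Percona Toolkit. pt-heartbeat updates a timestamp in a dedicated table on the master; on the slave it reads the replicated timestamp and compares it with the local system time to work out the lag. This article uses a script to check periodically how far the slave lags behind the master and to send an email alert, offered here for reference.
For installing the pt-heartbeat tool, see: "percona-toolkit: installation and introduction".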
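The comparison described above boils down to a subtraction of two timestamps. The following is a minimal sketch of that arithmetic only, not of how pt-heartbeat is implemented internally. The function name heartbeat_lag is hypothetical, the timestamps are made-up sample values, and GNU date is assumed (the script later in this article relies on date -d as well).

```shell
#!/bin/bash
# Hypothetical helper, for illustration only.
# Usage: heartbeat_lag HEARTBEAT_TIME [LOCAL_TIME]
#   HEARTBEAT_TIME  the timestamp written on the master, as read on the slave
#   LOCAL_TIME      the slave's local time (defaults to "now")
heartbeat_lag() {
    local hb now
    hb=$(date -d "$1" +%s) || return 1           # heartbeat as epoch seconds
    now=$(date -d "${2:-now}" +%s) || return 1   # local time as epoch seconds
    echo $(( now - hb ))                         # lag in whole seconds
}

# Sample values: the heartbeat was written 3 seconds before the local time.
heartbeat_lag "2024-01-01 12:00:00" "2024-01-01 12:00:03"   # prints 3
```

Because the result depends on two different system clocks, the master and slave clocks need to be kept in sync (for example with NTP), otherwise the clock offset shows up as lag.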
1. Script overview
2. Script contents
[mysql@SZDB run]$ more ck_slave_lag.sh
#!/bin/bash
#set -x
if [ $# -ne 3 ]; then
    echo "usage:"
    echo "ck_slave_lag.sh <Server-id> <MaxLag> <LogDir>"
    exit 1
fi

# Author : Leshami
# Blog   : http://blog.csdn.net/leshami

ServerID=$1
MaxLag=$2
LogDir=$3
Timestamp=`date +%Y%m%d_%H%M%S`
Retention=7
LogFile=$LogDir/slave_lag_$Timestamp.log
LagDetail=$LogDir/slave_lag_Detail_$Timestamp.log
# The recipient address is redacted in the source; set your own here.
mailadd="[email protected]"

echo $ServerID
echo $MaxLag
echo $LogDir
echo $LogFile
echo $LagDetail
echo $mailadd

if [ ! -d "$LogDir" ]; then
    mkdir -p "$LogDir"
fi

# One-off check of the current lag in seconds (printed as a decimal such as 0.00).
Lag=`/usr/bin/pt-heartbeat --user=monitor --password=xxx -S /tmp/mysql.sock -D test --master-server-id=$ServerID --check`
# Drop the fractional part so the value can be compared as an integer.
Lag=${Lag%.*}
#Lag=3
echo $Lag

# Is a pt-heartbeat monitor daemon already running?
ptStatus=`ps -ef|grep pt-heart|grep daemonize`
echo $ptStatus

if [ "$Lag" -gt "$MaxLag" ]; then
    echo "The current date is `date` at `hostname`." >>$LogFile
    echo "The current lag log file is $LogFile." >>$LogFile
    echo "The current replication lag is $Lag." >>$LogFile
    echo "The replication lag is larger than max lag $MaxLag." >>$LogFile
    # Start the daemon and send mail only if no monitor daemon is running yet.
    if [ -z "$ptStatus" ]; then
        echo "Start a monitor daemon with below command: " >>$LogFile
        echo "pt-heartbeat --user=monitor --password=xxx -S /tmp/mysql.sock -D test " >>$LogFile
        echo " --master-server-id=$ServerID --monitor --print-master-server-id --daemonize --log=$LagDetail" >>$LogFile
        /usr/bin/pt-heartbeat --user=monitor --password=xxx -S /tmp/mysql.sock -D test \
            --master-server-id=$ServerID --monitor --print-master-server-id --daemonize --log=$LagDetail
        echo "More detail please check lag log from $LagDetail." >>$LogFile
        cat $LogFile | mutt -s "Found slave lag on `hostname`." "$mailadd"
    fi
fi

# Kill a monitor daemon that has been running for more than 1800 seconds.
# Note: column 5 of ps -ef is read as today's start time (HH:MM).
if [ -n "$ptStatus" ]; then
    STime=`ps -ef|grep pt-heart|grep daemonize |gawk '{print $5}'`
    Pid=`ps -ef|grep pt-heart|grep daemonize |gawk '{print $2}'`
    STime=`date '+%Y%m%d'`" "$STime
    s_STime=`date -d "$STime" '+%s'`
    s_ETime=`date +%s`
    DiffSec=`expr $s_ETime - $s_STime`
    echo $STime
    echo $s_STime
    echo $s_ETime
    echo $DiffSec
    if [ "$DiffSec" -gt 1800 ]; then
        echo "kill -9 $Pid"
        kill -9 $Pid
    fi
fi

# Remove history slave lag log.
find $LogDir -name "*slave_lag*" -ctime +$Retention -delete
exit
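Two spots in the script are fragile: the integer comparison fails if the check prints nothing or an error message, and reading the start time from column 5 of ps -ef breaks once the daemon was started on a previous day, because ps then shows a date there instead of HH:MM. The sketch below shows one possible way to harden both. parse_lag and daemon_age are hypothetical helper names, not part of pt-heartbeat, and daemon_age assumes a procps-ng ps that supports the etimes output field.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical and not part of the original script.

# Print the integer part of a lag reading such as "3.52".
# Fail on anything that is not a plain non-negative decimal,
# for example an empty string when the check itself failed.
parse_lag() {
    local raw=$1
    case $raw in
        ''|.*|*.*.*|*[!0-9.]*) return 1 ;;
    esac
    echo "${raw%%.*}"
}

# Print how many seconds the process with the given PID has been running.
# Assumption: "ps" supports the "etimes" field (elapsed time in seconds).
daemon_age() {
    local pid=$1 age
    age=$(ps -o etimes= -p "$pid") || return 1
    echo $(( age ))    # arithmetic expansion also strips the padding spaces
}
```

In the script, the truncation step could then become Lag=$(parse_lag "$Lag") || exit 1, and the 1800-second test could compare the output of daemon_age "$Pid" directly instead of rebuilding the start time with date.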
3. Deployment reference
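[mysql@SZDB run]$ crontab -l
#check slave lag
*/1 * * * * /run/ck_slave_lag.sh 11 3 /log/SlaveLag
Here the script runs every minute; 11 is the master's server-id, 3 is the maximum lag (in seconds) tolerated before an alert, and /log/SlaveLag is the log directory.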