As an experienced CMS website operator, I fully understand how much the business depends on stable system operation. An automatic alert mechanism is crucial for service continuity when AnQiCMS fails to start. This article shows how to script start.sh so that it sends an alert email promptly when a startup exception occurs, letting the operations team learn of the problem and respond right away.

Understand the AnQiCMS start.sh script

During the deployment of AnQiCMS, we usually use the start.sh script to check and start the AnQiCMS application. According to the provided documentation, the main function of the original start.sh script is to determine whether the AnQiCMS process is running and, if not, to attempt to start it. The script is usually run by a cron task every minute, which gives the service its automatic recovery capability. Its core logic uses the command ps -ef | grep '\<anqicms\>' |grep -v grep |wc -l to count the anqicms processes; if the count is zero, it runs nohup $BINPATH/$BINNAME >> $BINPATH/running.log 2>&1 & to start the AnQiCMS application.
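The counting pipeline described above can be tried against sample ps output. The bash sketch below is only an illustration: count_word_matches is a hypothetical helper and the sample lines are made up. It suggests that the word match \<anqicms\> also counts command lines that merely contain anqicms as a path component, such as a cron invocation of start.sh itself, so it is worth checking what the pipeline actually counts on your own server.

```shell
# Hypothetical helper: count lines on stdin that contain the word given in $1,
# mirroring the grep '\<anqicms\>' | grep -v grep | wc -l pipeline.
count_word_matches() {
    grep "\<$1\>" | grep -v grep | wc -l
}

# Made-up sample of the command column of `ps -ef`
sample_ps='/www/wwwroot/anqicms/anqicms
/bin/bash /www/wwwroot/anqicms/start.sh
/usr/sbin/sshd -D'

# Both the binary and the start.sh command line contain the word "anqicms",
# so both are counted
printf '%s\n' "$sample_ps" | count_word_matches anqicms
```

If the script turns out to count itself, matching on the exact process name with pgrep -x anqicms is one possible alternative, assuming pgrep (from the procps package) is installed.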

Although this script can automatically attempt to restart the service, it cannot notify the operations team when a start fails. If AnQiCMS fails to start several times in a row, we may not find out in time, which can lead to a long service interruption. To solve this problem, we need to introduce an email alerting mechanism.

Prerequisites for introducing the email alert mechanism

Before modifying the start.sh script, we need to ensure that the server can send emails. The most common setup is a mail transfer agent (MTA) such as sendmail, Postfix, or msmtp already configured on the server, together with an installed mail command (usually provided by the mailx package). If no mail-sending service has been configured on the server, install and configure one according to the Linux distribution you are using. You can then test it with a command such as echo "Test" | mail -s "Test Subject" [email protected]. If you receive the email, the mail environment is ready.
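As a rough pre-flight check, the prerequisites above could be verified with a few lines of bash. This is a minimal sketch: have_cmd and check_mail_env are hypothetical helper names, and finding a binary on PATH only shows that it is installed, not that it is configured correctly.

```shell
# Hypothetical helper: report whether a command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Check for the mail client first, then list which of the MTAs
# mentioned above have a binary installed.
check_mail_env() {
    if ! have_cmd mail; then
        echo "mail command not found (usually provided by the mailx package)"
        return 1
    fi
    for mta in sendmail postfix msmtp; do
        have_cmd "$mta" && echo "found MTA binary: $mta"
    done
    return 0
}
```

Running check_mail_env before the test email narrows down whether a failure comes from a missing package or from the MTA configuration.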

If an email client cannot be configured on the server, you can also consider using the curl command to call an email service provider's API, or writing a simple Python script that sends emails via SMTP. To keep the start.sh script simple and portable, however, we recommend trying the mail command first.

Modify the start.sh script to implement the alert function

Now we will modify the existing start.sh script to add failure detection and email alert logic. Below is the modified script, followed by a detailed explanation of the added sections.

#!/bin/bash
### check and start AnqiCMS with email alert
# author fesion
# the bin name is anqicms
BINNAME=anqicms
BINPATH=/www/wwwroot/anqicms

# --- Email Alert Configuration ---
RECIPIENT_EMAIL="[email protected]" # address that receives alert emails
SENDER_EMAIL="[email protected]" # address that alert emails are sent from
ALERT_SUBJECT="AnQiCMS service startup failure alert" # subject of the alert email
# ---------------------------------

LOG_FILE="$BINPATH/running.log"
CHECK_LOG="$BINPATH/check.log"
TIMESTAMP=$(date +'%Y-%m-%d %H:%M:%S')

# Check if AnQiCMS is already running.
# pgrep -x matches the exact process name. ps -ef | grep '\<anqicms\>' would also
# count this script's own command line when the install path contains "anqicms".
exists=$(pgrep -x "$BINNAME" | wc -l)
echo "$TIMESTAMP $BINNAME PID check: $exists" >> "$CHECK_LOG"
echo "PID $BINNAME check: $exists"

if [ "$exists" -eq 0 ]; then
    echo "$TIMESTAMP $BINNAME NOT running, attempting to start..." >> "$CHECK_LOG"
    echo "$BINNAME NOT running, attempting to start..."

    # Attempt to start AnQiCMS in the background
    cd "$BINPATH" && nohup "$BINPATH/$BINNAME" >> "$LOG_FILE" 2>&1 &

    # Give AnQiCMS a moment to start up
    sleep 10

    # Re-check status after attempted start
    exists_after_start=$(pgrep -x "$BINNAME" | wc -l)
    echo "$TIMESTAMP Re-check $BINNAME PID: $exists_after_start" >> "$CHECK_LOG"
    echo "Re-check $BINNAME PID: $exists_after_start"

    if [ "$exists_after_start" -eq 0 ]; then
        ALERT_MESSAGE="AnQiCMS was still not running after the start attempt at $TIMESTAMP!\n"
        ALERT_MESSAGE+="Please check '$LOG_FILE' for detailed error information.\n"
        ALERT_MESSAGE+="Server: $(hostname)\n"
        ALERT_MESSAGE+="Startup script path: $0\n\n"

        # Add last few lines of the running log to the email for quick diagnosis
        ALERT_MESSAGE+="Latest content of $LOG_FILE (last 20 lines):\n"
        ALERT_MESSAGE+=$(tail -n 20 "$LOG_FILE" 2>&1 || echo "Unable to read the log file, or the log file is empty.")

        echo -e "$ALERT_MESSAGE" | mail -s "$ALERT_SUBJECT" -r "$SENDER_EMAIL" "$RECIPIENT_EMAIL"
        echo "$TIMESTAMP AnQiCMS failed to start; alert email sent to $RECIPIENT_EMAIL." >> "$CHECK_LOG"
        echo "AnQiCMS failed to start; alert email sent."
    else
        echo "$TIMESTAMP $BINNAME started successfully." >> "$CHECK_LOG"
        echo "$BINNAME started successfully."
    fi
else
    echo "$TIMESTAMP $BINNAME is already running." >> "$CHECK_LOG"
    echo "$BINNAME is already running."
fi

Script modification instructions:

  • Email configuration variables: At the beginning of the script, I added the variables RECIPIENT_EMAIL, SENDER_EMAIL, and ALERT_SUBJECT so that you can set the recipient address, sender address, and email subject for your environment. Be sure to replace the two [email protected] placeholders with actual email addresses.
  • Log path and timestamp: The LOG_FILE and CHECK_LOG variables point to the AnQiCMS running log and the script's own check log, and the TIMESTAMP variable is used for log entries.
  • Double status check: After the first check finds that AnQiCMS is not running and the script tries to start it, the script waits with sleep 10 (10 seconds; adjust the wait to your situation) and stores the result of a second check in the exists_after_start variable. This second check is what determines whether the startup succeeded.
  • Alert email content: If the second check still finds that AnQiCMS is not running, the script builds an alert email. The email contains the failure time, the path of the log file to check, the server hostname, and the startup script path, which helps to locate the problem quickly.
  • Log fragment appended: The tail -n 20 "$LOG_FILE" command fetches the last 20 lines of the AnQiCMS running log and appends them to the alert email. This is very helpful for a first diagnosis, because the reason for a startup failure usually appears at the end of the log.
  • Send email: The command echo -e "$ALERT_MESSAGE" | mail -s "$ALERT_SUBJECT" -r "$SENDER_EMAIL" "$RECIPIENT_EMAIL" sends the email. -s specifies the subject and -r specifies the sender. echo -e is used so that the \n sequences in the message are turned into line breaks.
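The fixed sleep 10 in the double status check could also be replaced by polling. The bash sketch below assumes a hypothetical helper, wait_until, that retries a check command once per second up to a timeout; it is an alternative design, not part of the original script.

```shell
# Hypothetical helper: run a check command once per second until it succeeds
# or the timeout (in seconds) is reached.
# Returns 0 as soon as the check succeeds, non-zero if it never does.
wait_until() {
    local timeout=$1
    shift
    local waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # One final check after the last sleep
    "$@" >/dev/null 2>&1
}
```

For example, wait_until 10 pgrep -x anqicms returns as soon as the process appears (assuming pgrep is installed). The trade-off is that polling reports success for a process that starts and then crashes a few seconds later, whereas the fixed wait catches any crash within the first 10 seconds.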

Configure the cron task

The modified start.sh script still needs to be scheduled with cron to perform the automatic check and alerting. Typically, we configure it to run every minute.

Open the cron configuration file:

crontab -e

Add or modify a line similar to the following in the editor that opens:

*/1 * * * * /bin/bash /www/wwwroot/anqicms/start.sh >> /www/wwwroot/anqicms/cron.log 2>&1

Make sure the path /www/wwwroot/anqicms/start.sh matches the actual installation path of your AnQiCMS. The >> /www/wwwroot/anqicms/cron.log 2>&1 part redirects the standard output and error output of start.sh to a log file, which is very useful for debugging and for monitoring the script's execution.
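If you prefer to add the entry without opening an editor, the line can be appended idempotently. This is a minimal bash sketch: with_cron_line is a hypothetical helper that only transforms text, and the commented pipeline at the end is what would actually change the crontab.

```shell
# Hypothetical helper: read an existing crontab on stdin and print it with
# the given line appended, unless an identical line is already present.
with_cron_line() {
    local line=$1 current
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # -x matches whole lines only, -F treats the line as a fixed string
    if ! printf '%s\n' "$current" | grep -qxF -- "$line"; then
        printf '%s\n' "$line"
    fi
}

CRON_LINE='*/1 * * * * /bin/bash /www/wwwroot/anqicms/start.sh >> /www/wwwroot/anqicms/cron.log 2>&1'

# To apply it (not executed here):
#   crontab -l 2>/dev/null | with_cron_line "$CRON_LINE" | crontab -
```

Because the helper checks for an identical line first, running the pipeline twice does not create a duplicate entry.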

Test the alert system

To verify that the alert system is working properly, you can perform the following test steps:

  1. Manually stop the AnQiCMS process: Use ps -ef | grep anqicms to find the PID of the AnQiCMS process, then kill it manually with kill -9 <PID>.
  2. Wait for the cron task to execute: Wait one minute for cron to trigger the modified start.sh script.
  3. Check email: Check the address you set in RECIPIENT_EMAIL to see whether an alert email has arrived.
  4. Check logs: Check the $BINPATH/check.log and $BINPATH/running.log files to confirm that the script's execution records and the AnQiCMS startup log are generated correctly.

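Note that the script sends the alert only when AnQiCMS is still not running after the restart attempt. If the restart succeeds, check.log records a successful start and no email is sent. To exercise the alert path, the start itself has to fail, for example by temporarily renaming the binary in a test environment before killing the process. In that case, if everything is configured correctly, you should receive an alert email containing the startup failure information and the related log fragment. Because cron runs start.sh every minute, the script keeps attempting to start the service after the alert.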

Summary

By following these steps, we have established a self-monitoring and alerting mechanism for AnQiCMS. It automatically detects the service status, attempts to recover when it finds an anomaly, and notifies the operations team promptly if the recovery fails. As website operators, we can then focus on content creation and user experience while entrusting system stability to automated processes.


Common Questions and Answers (FAQ)

Q: My server does not have the mail command, or the mail command cannot send emails normally. How can I set up email alerts?

A: If the mail command is unavailable or hard to configure, you can send emails another way. A common and flexible option is a Python script: start.sh can call a simple Python script that uses Python's built-in smtplib module to connect to your company's SMTP server or a public email service (such as Gmail or Outlook). Alternatively, if your server has internet access, you can use an email service provider's HTTP API, usually through the curl command. For example, add a call such as python /path/to/send_mail.py "subject" "body", or a curl ... command, to start.sh.
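curl can also speak SMTP directly, which avoids both the mail command and a provider-specific API. The bash sketch below is only an illustration under stated assumptions: build_message and send_via_curl are hypothetical helpers, SMTP_URL, SMTP_USER, and SMTP_PASS are variables you would define yourself, and smtp.example.com is a placeholder host.

```shell
# Hypothetical helper: print a minimal email message (headers, blank line, body).
# Arguments: sender, recipient, subject, body.
build_message() {
    printf 'From: %s\nTo: %s\nSubject: %s\n\n%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: send a message using curl's SMTP support.
# SMTP_URL, SMTP_USER and SMTP_PASS are assumed to be set by the caller,
# for example SMTP_URL="smtps://smtp.example.com:465" (placeholder host).
send_via_curl() {
    local from=$1 to=$2 subject=$3 body=$4
    build_message "$from" "$to" "$subject" "$body" |
        curl --silent --show-error --ssl-reqd \
             --url "$SMTP_URL" --user "$SMTP_USER:$SMTP_PASS" \
             --mail-from "$from" --mail-rcpt "$to" \
             --upload-file -
}
```

In start.sh, a call such as send_via_curl "$SENDER_EMAIL" "$RECIPIENT_EMAIL" "$ALERT_SUBJECT" "$ALERT_MESSAGE" could then stand in for the mail pipeline. A subject containing non-ASCII characters would need additional header encoding, which this sketch does not handle.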

Q: I am worried that frequent alert emails may flood my inbox, especially when the service repeatedly fails to start. How can I optimize the alert frequency?

A: This is a very practical question. You can add a simple throttling mechanism. For example, start.sh can check a temporary file (such as last_alert_time.txt): if less than X minutes (or hours) have passed since the last alert, it skips the email this time and sends again only after that interval. This prevents a flood of duplicate emails during a continuous failure, so you can focus on solving the problem. Also make sure each alert email carries enough information, such as the number of failed attempts or the latest error log.
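The throttling idea can be sketched in bash as follows. should_send_alert is a hypothetical helper, and the state file path and interval are values you would choose yourself; the function stores the time of the last alert as epoch seconds.

```shell
# Hypothetical helper: decide whether an alert may be sent now.
#   $1 = state file holding the epoch time of the last alert
#   $2 = minimum number of seconds between alerts
# Returns 0 (and records the current time) if an alert should be sent,
# 1 if the previous alert is still too recent.
should_send_alert() {
    local state_file=$1 min_interval=$2
    local now last=0
    now=$(date +%s)
    if [ -f "$state_file" ]; then
        last=$(cat "$state_file")
    fi
    # Treat an empty or corrupted value as "never alerted"
    case $last in
        ''|*[!0-9]*) last=0 ;;
    esac
    if [ $((now - last)) -ge "$min_interval" ]; then
        echo "$now" > "$state_file"
        return 0
    fi
    return 1
}
```

In start.sh, the mail command could be wrapped as if should_send_alert "$BINPATH/last_alert_time.txt" 1800; then ... fi to allow at most one email every 30 minutes. Deleting the state file after a successful start would make the next failure alert immediately.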

Q: What key information should an alert email contain to help me quickly locate the cause of an AnQiCMS startup failure?

A: An effective alert email should contain the information that lets you diagnose the problem quickly. In addition to the timestamp, server hostname, startup script path, log file path, and log snippet mentioned in this article, you can also consider including the following:

  • AnQiCMS version information: Helps to determine whether the issue is related to a specific version.
  • Server load information: For example, the output of the uptime command shows how the system as a whole is running.
  • Disk space usage: The output of df -h; insufficient disk space can also cause a startup failure.
  • Memory usage: The output of free -h; insufficient memory can also affect the startup of Go applications.
  • Database connection status: If possible, run a simple database connection test in the script and include the result in the email, since a failed database connection is a common reason for CMS startup failure.
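The system-level items in this list can be gathered with a small bash function. The sketch below uses two hypothetical helpers, section and collect_diagnostics. It covers only uptime, df -h, and free -h; the version and database checks are left out because their commands depend on your deployment.

```shell
# Hypothetical helper: print a labelled section with a command's output,
# or a note if the command is not installed.
section() {
    local title=$1
    shift
    echo "== $title =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "($1 not available)"
    fi
    echo
}

# Collect the system information listed above into one text block.
collect_diagnostics() {
    section "Server load" uptime
    section "Disk space" df -h
    section "Memory" free -h
}
```

The output of collect_diagnostics could be appended to the alert body before the email is sent, so that load, disk, and memory figures arrive together with the log fragment.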