As a senior CMS website operator at an information security company, I am well aware that stable, reliable website services are the cornerstone of successful content operations. AnQiCMS provides a solid foundation with its efficient features and concise Go architecture, but making startup and shutdown robust in real deployments, especially process management, is a critical step that deserves careful refinement.

In this article, we will look at how to introduce a more robust PID (process ID) file management mechanism into the AnQiCMS startup script, in order to avoid common problems such as zombie processes, occupied ports, and unexpected service interruptions, so that our AnQiCMS site keeps running stably.

Why is it necessary to have a more robust PID file management?

In AnQiCMS deployments we may run into scenarios like these: the AnQiCMS service crashes unexpectedly, but the system mistakenly believes it is still running; or a new instance fails to start with a port conflict because the old process never terminated completely. The approach used in the stock start.sh (judging whether the process exists via ps -ef | grep) is simple, but it has limitations in complex or abnormal situations.

For example, the grep command may misjudge: when the name or arguments of an unrelated process happen to contain "anqicms", it incorrectly reports that AnQiCMS is running and thereby prevents a new instance from starting. In addition, if the service crashes, grep cannot reliably tell apart a zombie process or one that has terminated but not yet been cleaned up, so the recorded process state may linger or be inaccurate, which complicates subsequent operations.

A PID file (process ID file) exists precisely to solve these problems. It is a simple text file that stores the unique identifier (PID) of one specific running process. With a PID file, we can track and manage an individual service instance precisely, so that every operation is based on accurate process status.
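
The false positive is easy to reproduce without touching a live system. The sketch below feeds a few made-up ps -ef lines (the process list is hypothetical) through the same pattern the stock start.sh uses and counts the matches; only one of the lines is the actual service.

```shell
#!/bin/bash
# Simulated `ps -ef` output (hypothetical lines): only the second line is the real service.
sample_ps='root  101  1  0 10:00 ?  00:00:00 vim /etc/anqicms.conf
www   202  1  0 10:01 ?  00:00:05 /www/wwwroot/anqicms/anqicms
root  303  1  0 10:02 ?  00:00:00 tail -f /var/log/nginx/access.log'

# Same pattern as the stock start.sh: "anqicms" as a whole word anywhere on the line.
matches=$(printf '%s\n' "$sample_ps" | grep -c '\<anqicms\>')
echo "lines matched by grep: $matches"
```

The editor session on the first line matches as well, because the word boundary is satisfied by the "/" before and the "." after "anqicms". A start script relying on this count would conclude the service is already up even if the real process were gone.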

Review of the existing AnQiCMS start/stop scripts

Let's review simplified versions of the start.sh and stop.sh scripts that AnQiCMS provides:

start.sh:

BINNAME=anqicms
BINPATH=/www/wwwroot/anqicms
exists=`ps -ef | grep '\<anqicms\>' |grep -v grep |wc -l`
if [ $exists -eq 0 ]; then
    # ... start the AnQiCMS process ...
    cd $BINPATH && nohup $BINPATH/$BINNAME >> $BINPATH/running.log 2>&1 &
fi

This script mainly uses grep to count matching processes and starts the service if the count is 0.

stop.sh:

BINNAME=anqicms
BINPATH="$( cd "$( dirname "$0"  )" && pwd  )"
exists=`ps -ef | grep '\<anqicms\>' |grep -v grep |awk '{printf $2}'`
if [ $exists -eq 0 ]; then
    # ... not running ...
else
    kill -9 $exists # force-kill immediately
fi

The stop script also relies on grep to obtain the PID and then force-terminates the process with kill -9. This can prevent the AnQiCMS service from performing necessary cleanup, such as closing database connections or saving temporary data.
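
To make the difference between the two signals concrete, here is a minimal sketch in which a background subshell stands in for the service. The stand-in installs a SIGTERM handler that writes a marker file; whether AnQiCMS itself installs a comparable handler is an assumption this example does not verify. The function name signal_worker and the marker files are hypothetical.

```shell
#!/bin/bash
demo_dir=$(mktemp -d)

# Start a stand-in "service" that cleans up on SIGTERM, send it the given
# signal, and record in $cleanup_ran whether its cleanup handler ran.
signal_worker() {
    local sig="$1"
    local marker="$demo_dir/$1.cleaned"
    (
        trap 'touch "$marker"; exit 0' TERM
        while true; do sleep 0.1; done
    ) &
    local pid=$!
    sleep 0.5                          # let the worker install its trap
    kill "-$sig" "$pid"
    wait "$pid" 2>/dev/null || true    # reap the worker
    if [ -f "$marker" ]; then cleanup_ran="yes"; else cleanup_ran="no"; fi
}

signal_worker TERM; term_result=$cleanup_ran
signal_worker KILL; kill_result=$cleanup_ran
rm -rf "$demo_dir"

echo "cleanup after SIGTERM: $term_result"
echo "cleanup after SIGKILL: $kill_result"
```

SIGKILL cannot be caught, so the handler never gets a chance to run in the second case. That is the reason to reserve kill -9 for a process that ignores the polite request.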

Introducing a robust PID file management mechanism

To overcome these limitations, we will rework start.sh and stop.sh so that they create, verify, use, and clean up PID files.

Core idea:

  1. At startup, check the PID file. If the file exists, read the PID and verify whether that process is really running. If it is running, refuse to start; if it is not, treat the file as stale and delete it. Then start AnQiCMS and write its PID to a new PID file.
  2. At shutdown, check the PID file. If the file exists, read the PID and verify whether that process is running. If it is running, send SIGHUP or SIGTERM first (to allow a graceful shutdown) and wait for a while; if the process still has not stopped, send SIGKILL (forced termination). Finally, delete the PID file regardless of success or failure.

1. Define the PID file path

First, we need to specify a unique PID file path for each AnQiCMS instance. Typically, the PID file is placed in the root of the AnQiCMS installation directory or in a dedicated run directory.

PID_FILE="$BINPATH/anqicms.pid"
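
If the dedicated-directory variant is preferred, the directory has to exist before the first write. The helper below is a small sketch of that idea; the function name pid_file_for and the subdirectory name run are assumptions made for illustration, not part of the AnQiCMS scripts.

```shell
#!/bin/bash
BINNAME=anqicms

# Print the PID file path for an install directory, creating a "run"
# subdirectory on demand (the subdirectory name is an assumption).
pid_file_for() {
    local base="$1"
    mkdir -p "$base/run" || return 1
    printf '%s\n' "$base/run/$BINNAME.pid"
}
```

A script could then set PID_FILE=$(pid_file_for "$BINPATH") and abort if the call fails, so a missing or unwritable directory surfaces before the service is launched.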

2. Transform the startup script (start.sh)

This version of start.sh is smarter: it handles the cases where the PID file exists, where the process is still running, and where the PID file is stale.

#!/bin/bash
### check and start AnqiCMS with robust PID management
# author fesion
# the bin name is anqicms

BINNAME=anqicms
BINPATH=/www/wwwroot/anqicms # Adjust to your actual path
LOG_FILE="$BINPATH/running.log"
CHECK_LOG="$BINPATH/check.log"
PID_FILE="$BINPATH/$BINNAME.pid"

echo "$(date +'%Y%m%d %H:%M:%S') --- AnQiCMS startup script initiated ---" >> "$CHECK_LOG"

# Function: check whether a PID is running
is_running() {
    local pid=$1
    if [ -z "$pid" ]; then
        return 1
    fi
    # kill -0 PID sends no signal; it only checks whether a process with that PID exists
    kill -0 "$pid" > /dev/null 2>&1
    return $?
}

# Check whether the PID file exists
if [ -f "$PID_FILE" ]; then
    CURRENT_PID=$(cat "$PID_FILE")
    echo "$(date +'%Y%m%d %H:%M:%S') PID file found: $PID_FILE, PID: $CURRENT_PID" >> "$CHECK_LOG"
    if is_running "$CURRENT_PID"; then
        echo "$(date +'%Y%m%d %H:%M:%S') AnQiCMS is already running with PID $CURRENT_PID. Exiting." >> "$CHECK_LOG"
        echo "AnQiCMS is already running with PID $CURRENT_PID. Exiting."
        exit 1 # Service is already running; exit
    else
        echo "$(date +'%Y%m%d %H:%M:%S') Stale PID file found. Removing $PID_FILE." >> "$CHECK_LOG"
        rm -f "$PID_FILE" # PID file exists but the process is dead; remove the stale file
    fi
else
    echo "$(date +'%Y%m%d %H:%M:%S') PID file not found. Proceeding with startup." >> "$CHECK_LOG"
fi

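# Start the AnQiCMS process
echo "$(date +'%Y%m%d %H:%M:%S') Starting AnQiCMS..." >> "$CHECK_LOG"
# cd separately: backgrounding "cd ... && nohup ... &" as one list may make $!
# the PID of a wrapper subshell instead of the AnQiCMS process itself
cd "$BINPATH" || exit 1
nohup "$BINPATH/$BINNAME" >> "$LOG_FILE" 2>&1 &
NEW_PID=$! # PID of the process just started in the background
echo "$NEW_PID" > "$PID_FILE" # Write the PID to the file
sleep 1 # Give the process a moment so an immediate crash is caught by the check below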

if is_running "$NEW_PID"; then
    echo "$(date +'%Y%m%d %H:%M:%S') AnQiCMS started successfully with PID $NEW_PID." >> "$CHECK_LOG"
    echo "AnQiCMS started successfully with PID $NEW_PID."
else
    echo "$(date +'%Y%m%d %H:%M:%S') Failed to start AnQiCMS." >> "$CHECK_LOG"
    echo "Failed to start AnQiCMS."
    rm -f "$PID_FILE" # Startup failed; clean up the PID file
    exit 1
fi

Description:

  • The is_running function uses kill -0 to check whether the process exists, which is more precise than grep.
  • The script first checks the PID file and decides whether the service is running based on the PID stored in it.
  • If the PID file exists but the corresponding process has died, the script automatically removes this stale PID file.
  • After a successful start, the PID of the new process is written to the anqicms.pid file.
  • If startup fails, the PID file is cleaned up as well.
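
One caveat with kill -0 alone: the operating system can reuse PIDs, so a stale PID file may point at an unrelated process that happens to be alive. A stricter check can also compare the command name. The function below is a sketch of that idea, not part of the original scripts; the name is_expected_process is hypothetical, and it assumes a ps that supports the POSIX -p and -o comm= options.

```shell
#!/bin/bash
# Return 0 only if the PID is alive AND its command name matches the expected binary.
is_expected_process() {
    local pid="$1"
    local expected="$2"
    local comm
    [ -n "$pid" ] || return 1
    kill -0 "$pid" 2>/dev/null || return 1
    # `ps -p PID -o comm=` prints the command name without a header line.
    comm=$(ps -p "$pid" -o comm= 2>/dev/null)
    # Some systems print a full path here; compare only the last path component.
    [ "${comm##*/}" = "$expected" ]
}
```

In start.sh this could replace the plain is_running "$CURRENT_PID" call with is_expected_process "$CURRENT_PID" "$BINNAME". Note that kill -0 also fails for a process owned by another user, so the check should run as the same user that starts the service.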

3. Refactoring the stop script (stop.sh)

This version of stop.sh tries a graceful shutdown first and forces termination only after a timeout.

#!/bin/bash
### stop AnqiCMS with robust PID management
# author fesion
# the bin name is anqicms

BINNAME=anqicms
BINPATH="$( cd "$( dirname "$0" )" && pwd )" # Get the script directory
CHECK_LOG="$BINPATH/check.log"
PID_FILE="$BINPATH/$BINNAME.pid"
GRACEFUL_TIMEOUT=10 # Seconds to wait for a graceful shutdown

echo "$(date +'%Y%m%d %H:%M:%S') --- AnQiCMS stop script initiated ---" >> "$CHECK_LOG"

# Function: check whether a PID is running
is_running() {
    local pid=$1
    if [ -z "$pid" ]; then
        return 1
    fi
    kill -0 "$pid" > /dev/null 2>&1
    return $?
}

# Check whether the PID file exists

if [ -f "$PID_FILE" ]; then

TARGET_PID=$(cat "$PID_FILE")
echo "$(date +'%Y%m%d %H:%M:%S') PID file found: $PID_FILE, PID: $TARGET_PID" >> "$CHECK_LOG"

if is_running "$TARGET_PID"; then
    echo "$(date +'%Y%m%d %H:%M:%S') Attempting graceful shutdown for AnQiCMS (PID: $TARGET_PID)..." >> "$CHECK_LOG"
    kill "$TARGET_PID" # Send SIGTERM (15) to request a graceful shutdown

    # Wait for the process to shut down gracefully
    for i in $(seq 1 $GRACEFUL_TIMEOUT); do
        if ! is_running "$TARGET_PID"; then
            echo "$(date +'%Y%m%d %H:%M:%S') AnQiCMS (PID: $TARGET_PID) stopped gracefully." >> "$CHECK_LOG"
            break
        fi
        sleep 1
    done

    if is_running "$TARGET_PID"; then
        echo "$(date +'%Y%m%d %H:%M:%S') AnQiCMS (PID: $TARGET_PID) did not stop gracefully within $GRACEFUL_TIMEOUT seconds. Forcing shutdown..." >> "$CHECK_LOG"
        kill -9 "$TARGET_PID" # Send SIGKILL (9) to force termination
        sleep 1 # Give the system time to terminate the process
        if ! is_running "$TARGET_PID"; then
            echo "$(date +'%Y%m