Sensitive word filtering is a content review technique used by websites, applications, and platforms to prevent users from posting content that is inappropriate, illegal, or in violation of policy. When operating a website, we often have to worry about users posting content that contains sensitive words. Such content can get the site reported by users, banned by the server operator, or lead to the site owner being summoned and fined by the relevant authorities. To prevent this, we need to filter sensitive words.
Implementing sensitive word filtering involves several steps, covering both the technical implementation and the strategy behind it. The following uses the sensitive word filtering design of AnQi CMS as an example.

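import (
	"bytes"
	"fmt"
	"regexp"
	"strings"
	"unicode/utf8"
)

// ReplaceSensitiveWords masks every sensitive word found in content with asterisks.
func ReplaceSensitiveWords(content []byte, sensitiveWords []string) []byte {
	// Return early if the word list or the content is empty.
	if len(sensitiveWords) == 0 || len(content) == 0 {
		return content
	}
	// replaceType records a placeholder (Key) and the original text it stands for (Value).
	type replaceType struct {
		Key   []byte
		Value []byte
	}
	var replacedMatch []*replaceType
	numCount := 0
	// Swap every HTML tag (including its attributes) for a placeholder, so that
	// tag names and attributes are never replaced with * and the page does not break.
	reg := regexp.MustCompile(`(?i)<!?/?[a-z0-9-]+(\s+[^>]+)?>`)
	content = reg.ReplaceAllFunc(content, func(s []byte) []byte {
		key := []byte(fmt.Sprintf("{$%d}", numCount))
		replacedMatch = append(replacedMatch, &replaceType{
			Key:   key,
			Value: s,
		})
		numCount++
		return key
	})
	// Replace all sensitive words with asterisks.
	for _, word := range sensitiveWords {
		if len(word) == 0 {
			continue
		}
		if bytes.Contains(content, []byte(word)) {
			// Literal match: one asterisk per character of the word.
			content = bytes.ReplaceAll(content, []byte(word), bytes.Repeat([]byte("*"), utf8.RuneCountInString(word)))
		} else if strings.HasPrefix(word, "{") && strings.HasSuffix(word, "}") && len(word) > 2 {
			// Regular expression rules are also supported. A rule is a pattern
			// that starts with { and ends with }, e.g. {[1-9]\d{4,10}}.
			// Strip the leading and trailing braces.
			newWord := word[1 : len(word)-1]
			re, err := regexp.Compile(newWord)
			if err == nil {
				// Note: the number of asterisks is taken from the rule text
				// (braces included), not from the length of each match.
				content = re.ReplaceAll(content, bytes.Repeat([]byte("*"), utf8.RuneCountInString(word)))
			}
		}
	}
	// Restore the HTML tags that were set aside above.
	for i := len(replacedMatch) - 1; i >= 0; i-- {
		content = bytes.Replace(content, replacedMatch[i].Key, replacedMatch[i].Value, 1)
	}
	return content
}

The filter is applied at the template rendering stage: after a template has been executed, the rendered output is passed through the sensitive word replacement before it is written to the response.

func (s *DjangoEngine) ExecuteWriter(w io.Writer, filename string, _ string, bindingData interface{}) error {
	// If debug mode is enabled, re-parse the templates on every render.
	if s.reload {
		if err := s.LoadStart(true); err != nil {
			return err
		}
	}
	ctx := w.(iris.Context)
	currentSite := provider.CurrentSite(ctx)
	if tmpl := s.fromCache(currentSite.Id, filename); tmpl != nil {
		data, err := tmpl.ExecuteBytes(getPongoContext(bindingData))
		if err != nil {
			return err
		}
		// Apply sensitive word replacement to the rendered data.
		data = currentSite.ReplaceSensitiveWords(data)
		buf := bytes.NewBuffer(data)
		_, err = buf.WriteTo(w)
		return err
	}
	// If the template does not exist, return an error.
	return view2.ErrNotExist{Name: filename, IsLayout: false, Data: bindingData}
}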
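To make the two rule formats more concrete, here is a simplified, self-contained sketch. `maskWords` and `stars` are hypothetical helpers written for this illustration, not part of AnQi CMS. The sketch leaves out the HTML tag protection and checks for a `{regex}` rule before trying a literal match. Unlike the function above, which sizes the mask from the rule text, it masks each regex match with one asterisk per matched character.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
	"strings"
	"unicode/utf8"
)

// stars returns one asterisk for every character (rune) in b.
func stars(b []byte) []byte {
	return bytes.Repeat([]byte("*"), utf8.RuneCount(b))
}

// maskWords is a hypothetical, simplified filter. A rule wrapped in braces
// is treated as a regular expression; any other rule is matched literally.
func maskWords(content string, rules []string) string {
	out := []byte(content)
	for _, rule := range rules {
		if rule == "" {
			continue
		}
		if len(rule) > 2 && strings.HasPrefix(rule, "{") && strings.HasSuffix(rule, "}") {
			re, err := regexp.Compile(rule[1 : len(rule)-1])
			if err != nil {
				continue // skip rules whose pattern does not compile
			}
			out = re.ReplaceAllFunc(out, stars)
			continue
		}
		out = bytes.ReplaceAll(out, []byte(rule), stars([]byte(rule)))
	}
	return string(out)
}

func main() {
	rules := []string{"badword", `{[1-9]\d{4,10}}`}
	fmt.Println(maskWords("badword, QQ 123456", rules))
	// prints "*******, QQ ******"
}
```

The regex rule in this example is the one from the code comment above; it matches runs of 5 to 11 digits that do not start with 0, such as a QQ number.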
The above covers the approach to and practice of sensitive word filtering. In actual use, you should optimize and adjust it according to your own needs. On top of automatic machine filtering, add manual review for some content and carry out regular inspections, especially for content that is prone to ambiguity or that requires deeper semantic analysis.
Sensitive word filtering is a complex and dynamic process. It requires efficient technical means as well as flexible strategies that can keep up with the constantly changing language environment and policy requirements. I hope this content helps you.
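One possible way to combine automatic filtering with manual review is to record which words were hit and hand those posts to a moderator, instead of only masking them. The following is a minimal sketch under that assumption. The `reviewItem` type and the `findHits` function are hypothetical names used for illustration and are not part of AnQi CMS.

```go
package main

import (
	"fmt"
	"strings"
)

// reviewItem is a hypothetical record for a manual review queue.
type reviewItem struct {
	Content string
	Hits    []string // sensitive words found in Content
}

// findHits returns every entry of words that occurs in content.
// Matching is a plain, case-insensitive substring check.
func findHits(content string, words []string) []string {
	lower := strings.ToLower(content)
	var hits []string
	for _, w := range words {
		if w != "" && strings.Contains(lower, strings.ToLower(w)) {
			hits = append(hits, w)
		}
	}
	return hits
}

func main() {
	words := []string{"casino", "lottery"}
	var queue []reviewItem
	for _, post := range []string{"Win big at our Casino!", "Hello world"} {
		// Only posts with at least one hit are queued for a human reviewer.
		if hits := findHits(post, words); len(hits) > 0 {
			queue = append(queue, reviewItem{Content: post, Hits: hits})
		}
	}
	fmt.Println(len(queue), queue[0].Hits)
	// prints "1 [casino]"
}
```

Keeping the list of hits next to the content lets a reviewer see at a glance why a post was flagged, which helps with ambiguous cases that a word list alone cannot decide.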