In website operation, the security and presentation of user-uploaded content are crucial. When users can submit content containing HTML tags, every operator faces the same challenge: how to sanitize that content effectively, prevent malicious code injection (XSS attacks), and ensure that it displays as expected. AnQiCMS, a system focused on efficiency and security, provides a variety of tools to meet this need.

This article will delve into how to use the built-in features and powerful template filters of AnQiCMS to preliminarily clean the HTML content uploaded by users, in order to ensure the safety and standardization of the website.

I. Control at the Source: Preliminary Protection Through Backend Content Settings

AnQiCMS provides some global settings on the backend, which can be used for preliminary filtering and standardization during the content publishing process, helping to reduce the workload of complex purification on the template level.

In the AnQiCMS backend, we can find the 'Content Settings' option. Here, there are several key settings that have a direct impact on the purification of HTML content:

  1. Whether to automatically filter external links: This option helps us manage external links included in user content. It is very useful if we do not want users to add external links freely, or if we want to handle external links uniformly, for example by adding the rel="nofollow" attribute or by removing the links outright, which can prevent the spread of spammy or malicious links to some extent.

These settings provide a basic level of protection at publication time, which is the first step in a content security policy.
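The two ways of handling external links mentioned above can be sketched in Python as follows. This is only an illustration of the idea, not how AnQiCMS implements the setting: SITE_HOST, is_external, and handle_external_links are hypothetical names, and the regular expression only copes with simple <a href="..."> tags.

```python
import re
from urllib.parse import urlparse

# Hypothetical site host, used only for this illustration.
SITE_HOST = "www.example.com"

# Matches a simple <a ... href="...">text</a> pair; a sketch, not a full HTML parser.
ANCHOR_RE = re.compile(
    r'<a\s+([^>]*?)href="([^"]*)"([^>]*)>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def is_external(href: str) -> bool:
    """A link counts as external if it names a host other than our own."""
    host = urlparse(href).netloc
    return bool(host) and host.lower() != SITE_HOST


def handle_external_links(html_text: str, remove: bool) -> str:
    """Unwrap external links (keeping their text) or mark them rel="nofollow"."""

    def replace(match):
        before, href, after, text = match.groups()
        if not is_external(href):
            return match.group(0)  # internal or relative link: leave untouched
        if remove:
            return text  # drop the <a> wrapper, keep the visible text
        return f'<a {before}href="{href}"{after} rel="nofollow">{text}</a>'

    return ANCHOR_RE.sub(replace, html_text)


sample = 'See <a href="https://spam.example.net/x">this</a> and <a href="/about">about</a>.'
print(handle_external_links(sample, remove=True))
print(handle_external_links(sample, remove=False))
```

Relative links such as /about have no host and are therefore treated as internal. A real implementation would also need to cope with links that already carry a rel attribute, which this sketch does not.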

II. Refining the Details: Using the AnQiCMS template filter for deep purification.

Even after preliminary filtering on the backend, the HTML content uploaded by users may still contain irregular or potentially harmful tags. This is where AnQiCMS's template filters come into play, performing fine-grained sanitization before the content is finally presented to the user. Understanding what these filters do and when to use them is the key to keeping content both safe and presentable.

AnQiCMS templates automatically escape all output by default, which means that HTML tags such as <script> are converted to &lt;script&gt; and displayed as plain text, preventing the browser from executing malicious code. However, if we want to retain some HTML tags (such as <strong> or <p>), the output needs to be combined with specific filters.
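To make the default behavior concrete, here is a minimal Python sketch of what automatic escaping does to a string before it reaches the browser. It uses Python's standard html.escape as a stand-in and is not AnQiCMS code; the function name auto_escape is made up for this illustration.

```python
import html


def auto_escape(value: str) -> str:
    """Replace &, <, >, and both quote characters with HTML entities."""
    return html.escape(value, quote=True)


# The browser shows the result as literal text instead of running it.
print(auto_escape("<script>alert('xss');</script>"))
# prints: &lt;script&gt;alert(&#x27;xss&#x27;);&lt;/script&gt;
```

Python writes the single quote as &#x27; while other escapers write &#39;; both are references to the same character, so the browser displays them identically.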

1. Remove all HTML tags: striptags

When you want user-uploaded HTML content to be displayed only as plain text, the striptags filter is the most direct and effective choice. It removes all HTML tags from the content, leaving only the text.

Use Cases:

  • When displaying content summaries on the article list page, only plain text should be retained to avoid complex styles affecting the layout.
  • User comments or messages should only be allowed to submit plain text to ensure maximum safety.

Example: assume archive.Content is the HTML content uploaded by the user, and we want to convert it into a plain-text summary:

{{ archive.Content|striptags }}

With this filter, <p>This is <b>important</b> information<script>alert('xss');</script></p> becomes This is important information.
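The effect described above can be reproduced with a short Python sketch built on the standard html.parser module. It is an illustration of tag stripping under one assumption, namely that the bodies of <script> and <style> elements are discarded along with the tags; it is not the AnQiCMS implementation, and the names TextOnly and strip_tags are hypothetical.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text nodes, ignores tags, and skips <script>/<style> bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.parts.append(data)


def strip_tags(html_text: str) -> str:
    parser = TextOnly()
    parser.feed(html_text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts).strip()


print(strip_tags("<p>This is <b>important</b> information<script>alert('xss');</script></p>"))
# prints: This is important information
```

Because the parser decodes entities, the returned text should still pass through normal output escaping before it is written into a page.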

2. Precise control: remove specified HTML tags: removetags

If your business requirements allow users to use some HTML tags (such as <strong>, <em>, or <p>) but must prohibit dangerous ones (such as <script> or <iframe>), the removetags filter provides finer control. It lets you specify a list of HTML tags to remove.

Use Cases:

  • Allow users to include basic formatting tags in their content, but prohibit embedding scripts, frames, or styles.
  • Need to clear redundant or unsafe tags left by a specific third-party rich text editor.

Example: we want to keep paragraphs and bold text, but remove script, iframe, and image tags:

{{ archive.Content|removetags:"script,iframe,img"|safe }}

Important notice: after using removetags to strip unwanted tags, if you want the remaining HTML tags to be parsed and rendered by the browser, the filter must be followed by |safe, because AnQiCMS's default automatic escaping would otherwise convert every HTML tag into entity characters. |safe tells AnQiCMS that this content has been sanitized and can be output as HTML.
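A small Python sketch helps show what a tag blocklist does and does not cover. It assumes the filter removes only the named tags themselves and keeps whatever text they enclosed; that is an assumption of this sketch, not a statement about AnQiCMS internals, and remove_tags is a hypothetical helper.

```python
import re


def remove_tags(html_text: str, tag_names: str) -> str:
    """Drop opening and closing tags whose names appear in a comma-separated list."""
    names = [re.escape(name.strip()) for name in tag_names.split(",") if name.strip()]
    if not names:
        return html_text
    # Matches <name ...> and </name> for every listed name, case-insensitively.
    pattern = re.compile(r"</?(?:%s)\b[^>]*>" % "|".join(names), re.IGNORECASE)
    return pattern.sub("", html_text)


dirty = '<p onclick="steal()">Hi <b>there</b><script>alert(1)</script><img src="x.png"></p>'
print(remove_tags(dirty, "script,iframe,img"))
# prints: <p onclick="steal()">Hi <b>there</b>alert(1)</p>
```

The listed tags are gone, but the onclick attribute on the allowed <p> tag survives. A blocklist of tag names says nothing about attributes, so output that will be marked |safe is only as safe as the filtering applied before it; for untrusted input, strict server-side sanitization remains the more reliable safeguard.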

3. Display the original code: escape HTML special characters: escape (or e)

Although AnQiCMS enables automatic escaping by default, in some special cases you may need to force escaping manually, for example when you want to escape content again after |safe has been applied. The escape filter (or its abbreviation e) converts HTML special characters (such as <, >, &, ", and ') into the corresponding HTML entities.

Use Cases:

  • Display the user-submitted HTML code snippets on the page (instead of rendering them).
  • Add manually as a safety measure before any HTML output that is uncertain whether it is safe.

Example:

{{ "<strong>Hello!</strong><script>alert('xss');</script>"|e }}

This escapes all tags and special characters, so the final output is &lt;strong&gt;Hello!&lt;/strong&gt;&lt;script&gt;alert(&#39;xss&#39;);&lt;/script&gt;. What the user sees is the code itself, not the rendered effect or a popup.

4. Use with caution: disable automatic escaping: safe

The safe filter is a powerful feature in AnQiCMS, but it must be used with great caution. It disables automatic HTML escaping for the current expression, forcing the content to be output as HTML.

Use Cases:

  • When the content source is absolutely trustworthy, or has been strictly sanitized on the server side.
  • Combined with the removetags filter, so that the legitimate HTML that remains after filtering can be rendered.

Example:

{{ my_trustworthy_html_content|safe }}

Severe warning: never apply the |safe filter directly to unfiltered user-uploaded HTML content. Doing so leads straight to XSS (Cross-Site Scripting) vulnerabilities, allowing malicious users to inject and execute JavaScript code and seriously endangering the security of the website and its users' data.
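Why this warning matters becomes clearer with a toy model of an auto-escaping renderer. The Python sketch below is a simplified illustration, not AnQiCMS code: SafeHTML and render are hypothetical names, and marking a value as SafeHTML plays the role that |safe plays in a template.

```python
import html


class SafeHTML(str):
    """Marker type: a string the caller vouches for as already-safe HTML."""


def render(value: str) -> str:
    """Escape by default; pass values marked SafeHTML through untouched."""
    if isinstance(value, SafeHTML):
        return str(value)
    return html.escape(value)


payload = "<img src=x onerror=alert(1)>"
print(render(payload))            # escaped: shown as text
print(render(SafeHTML(payload)))  # unescaped: the browser would run the handler
```

The marker performs no checking of its own. It only switches escaping off, so whatever reaches it is delivered to the browser exactly as submitted.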

5. Other auxiliary purification filters

In addition to the filters mainly used for HTML content purification mentioned above, AnQiCMS also provides some other filters that are helpful for content processing:

  • urlize / urlizetrunc: These filters automatically convert URL strings in text into clickable hyperlinks. Although they do not sanitize content directly, they standardize how URLs are displayed and automatically add the rel="nofollow" attribute to the links, which helps SEO and reduces the payoff of link spam.
  • escapejs: If you need to embed user-provided data in JavaScript code, this filter escapes characters that are special in JavaScript to prevent JS injection. For sanitizing HTML content itself, however, removetags or striptags is usually the better choice.
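The idea behind escaping data for a JavaScript context can be sketched in Python. Note that this sketch swaps in a different technique, JSON encoding plus extra escaping of <, > and &, rather than reproducing the escapejs filter; js_string_literal is a hypothetical helper name.

```python
import json


def js_string_literal(value: str) -> str:
    """Encode text as a quoted JavaScript string literal that is safe inside <script>."""
    literal = json.dumps(value)  # handles quotes, backslashes, control characters
    # Neutralize characters that could close the <script> block or start a tag or entity.
    for char in "<>&":
        literal = literal.replace(char, "\\u%04x" % ord(char))
    return literal


user_value = '</script><script>alert("x")</script>'
print("var nickname = %s;" % js_string_literal(user_value))
```

The encoded literal contains no raw < or > characters, so the user's text cannot terminate the surrounding script element, yet JavaScript still decodes it back to the original string.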

III. Comprehensive Application and Best Practices

Content purification is not a one-time solution, but requires the combination of various strategies:

  1. Layered Defense: First, use the 'Content Settings' in the AnQiCMS backend for preliminary screening. Second, at the template output level, choose filters flexibly according to specific requirements: striptags, removetags,