When running a website, the security and presentation of user-uploaded content are crucial. When users can submit content containing HTML tags, every operator faces the same challenge: how to perform effective preliminary purification, prevent malicious code injection (XSS attacks), and ensure the content is displayed as expected. AnQiCMS, as an efficient and secure system, provides a variety of tools to meet this need.

This article explains how to use the built-in features and template filters of AnQiCMS to perform a preliminary cleanup of user-uploaded HTML content, so that the website's content stays safe and well-formed.

One, control at the source: preliminary protection through backend content settings

AnQiCMS provides several global backend settings that apply preliminary filtering and standardization when content is published, which reduces the amount of purification work needed at the template level.

In the AnQiCMS backend, we can find the 'Content Settings' option. Here, there are several key settings that have a direct impact on the purification of HTML content:

  1. Whether to automatically filter external links: This option helps us manage external links contained in user content. It is useful if we do not want users to add arbitrary external links, or if we want all external links to be handled in a uniform way. Once it is turned on, the system automatically detects and processes external links, for example by adding the rel="nofollow" attribute to them or by removing the links outright, which helps to some extent to prevent the spread of spam or malicious links.

These settings provide a basic level of protection at the point of publication and are the first step in a content security strategy.

Two, fine-tuning: using AnQiCMS template filters for deep purification

Even after preliminary filtering on the backend, user-uploaded HTML content may still contain irregular or potentially harmful tags. This is where the AnQiCMS template filters come in: they perform detailed purification just before the content is presented to the user. Understanding what each filter does and when to use it is essential for keeping content both safe and well presented.

AnQiCMS templates automatically escape all output by default, which means that tags like <script> are converted to &lt;script&gt; and displayed as plain text, preventing the browser from executing malicious code. However, if we want to retain some HTML tags (such as <strong> or <p>), we need to use specific filters.

1. Remove all HTML tags: striptags

When you want user-uploaded HTML content to be displayed only as plain text, the striptags filter is the most direct and effective choice. It removes all HTML tags from the content, leaving only text.

Usage scenario:

  • When displaying the content summary on the article list page, only plain text should be retained to avoid complex styles affecting the layout.
  • In order to ensure maximum safety, only plain text submissions are allowed for user comments or messages.

Example: suppose archive.Content is the HTML content uploaded by the user, and we want to convert it into a plain-text summary:

{{ archive.Content|striptags }}

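Thus, <p>This is <b>important</b> information<script>alert('xss');</script></p> is reduced to plain text such as This is important information. Depending on the implementation, the text between the script tags (alert('xss');) may remain in the output as inert plain text, because a tag-stripping filter removes the tags themselves rather than the text they enclose.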
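To make the behaviour of a tag-stripping filter concrete, here is a minimal Python sketch of the idea. It is an illustration only, not AnQiCMS's actual implementation; the function name strip_tags and the regular expression are assumptions made for this example.

```python
import re

# Matches anything that looks like an HTML tag:
# "<", then any run of characters other than ">", then ">".
TAG_RE = re.compile(r"<[^>]*>")


def strip_tags(html_text: str) -> str:
    """Remove every tag-like token and trim surrounding whitespace."""
    return TAG_RE.sub("", html_text).strip()


sample = "<p>This is <b>important</b> information<script>alert('xss');</script></p>"
print(strip_tags(sample))
# prints: This is important informationalert('xss');
```

In this sketch the text inside the script element survives as ordinary text. It can no longer execute, because the surrounding tags are gone, but it may still show up in a summary.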

2. Precise control: Remove specified HTML tags: removetags

If your business needs allow users to use some HTML tags (such as <strong>, <em>, or <p>) but require banning dangerous ones (such as <script> or <iframe>), the removetags filter provides finer control. It lets you specify the list of HTML tags to be removed.

Usage scenario:

  • Allow users to include basic formatting tags in their content, but prohibit the embedding of scripts, frames, or styles.
  • Need to clear redundant or unsafe tags left by a specific third-party rich text editor.

Example: we want to keep paragraphs and bold text, but remove script, iframe, and image tags:

{{ archive.Content|removetags:"script,iframe,img"|safe }}

Important reminder: after using removetags to remove unwanted tags, you must add the |safe filter after it if you want the remaining valid HTML tags to be parsed and rendered by the browser. Otherwise, the default automatic escaping in AnQiCMS converts all HTML tags into entity characters. |safe tells AnQiCMS that this content has been purified and can be output as HTML.
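The blocklist idea behind a filter like removetags can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the AnQiCMS implementation: the function name remove_tags and its comma-separated parameter format are chosen here to mirror the template usage above.

```python
import re


def remove_tags(html_text: str, tags: str) -> str:
    """Remove opening and closing forms of each named tag, keeping other markup."""
    names = [re.escape(t.strip()) for t in tags.split(",") if t.strip()]
    if not names:
        return html_text
    # "<" or "</", one of the names, a word boundary (so "i" does not match "<img"),
    # then anything up to the closing ">".
    pattern = re.compile(r"</?(?:%s)\b[^>]*>" % "|".join(names), re.IGNORECASE)
    return pattern.sub("", html_text)


sample = '<p>Hello <strong>world</strong><script>alert(1)</script><img src="x.png"></p>'
print(remove_tags(sample, "script,iframe,img"))
# prints: <p>Hello <strong>world</strong>alert(1)</p>
```

Two properties of this sketch are worth noting: the text inside a removed tag is kept, and attributes on the tags that remain are not inspected. A blocklist of this kind is therefore a preliminary cleanup rather than a complete sanitizer.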

3. Display the original code: Escape HTML special characters: escape (or e)

Although AnQiCMS escapes output automatically by default, in some special cases you may need to force escaping manually, or to escape content again after |safe has been used. The escape filter (or its abbreviation, e) converts HTML special characters (such as <, >, &, ", and ') into their corresponding HTML entities.

Usage scenario:

  • Display the HTML code snippet submitted by the user on the page (instead of rendering it).
  • Apply it manually as an extra line of defense before outputting any HTML whose safety is uncertain.

Example:

{{ "<strong>Hello!</strong><script>alert('xss');</script>"|e }}

This escapes all tags and special characters, and the output becomes &lt;strong&gt;Hello!&lt;/strong&gt;&lt;script&gt;alert(&#39;xss&#39;);&lt;/script&gt;. The user sees the code itself, not the rendered effect or a popup.
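The escaping step itself is a simple character-to-entity substitution. The following Python sketch shows the principle; it is not AnQiCMS code. The name escape_html is hypothetical, and the entity used for the double quote (&quot;) is an assumption, since engines differ on which equivalent entity they emit.

```python
# Replacement table. "&" must come first so that the "&" in entities
# produced by later replacements is not escaped a second time.
_HTML_ENTITIES = [
    ("&", "&amp;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ('"', "&quot;"),  # assumption: some engines emit &#34; instead
    ("'", "&#39;"),
]


def escape_html(text: str) -> str:
    """Replace HTML special characters with their entity equivalents."""
    for char, entity in _HTML_ENTITIES:
        text = text.replace(char, entity)
    return text


sample = "<strong>Hello!</strong><script>alert('xss');</script>"
print(escape_html(sample))
# prints: &lt;strong&gt;Hello!&lt;/strong&gt;&lt;script&gt;alert(&#39;xss&#39;);&lt;/script&gt;
```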

4. Use with caution: Disable automatic escaping: safe

The safe filter is a powerful feature of AnQiCMS that must be used with great caution. It disables automatic HTML escaping for the current expression, forcing the content to be output as HTML.

Usage scenario:

  • When the content source is absolutely trustworthy, or the content has been strictly sanitized on the server side.
  • When used together with the removetags filter, so that the legitimate HTML that remains after filtering can be rendered.

Example:

{{ my_trustworthy_html_content|safe }}

Severe warning: Never apply the |safe filter directly to unfiltered user-uploaded HTML content. Doing so leads directly to an XSS (cross-site scripting) vulnerability, allowing malicious users to inject and execute JavaScript code and seriously endangering the website and its users' data.
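The relationship between automatic escaping and a "safe" marker can be modelled with a small Python sketch. It is a conceptual illustration only: SafeString, mark_safe, and render are hypothetical names invented for this example and say nothing about how AnQiCMS implements the mechanism internally.

```python
import html


class SafeString(str):
    """Marker type: a string the caller vouches for as already-sanitized HTML."""


def mark_safe(value: str) -> SafeString:
    """Plays the role of the |safe filter in this sketch."""
    return SafeString(value)


def render(value: str) -> str:
    """Escape by default; pass through only values explicitly marked safe."""
    if isinstance(value, SafeString):
        return str(value)
    return html.escape(value)


user_input = "<script>alert(1)</script>"
print(render(user_input))             # prints: &lt;script&gt;alert(1)&lt;/script&gt;
print(render(mark_safe(user_input)))  # prints: <script>alert(1)</script>
```

The second call shows why the warning matters: marking a value as safe does not clean it, it only switches the protection off.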

5. Other auxiliary purification filters

In addition to the above filters mainly used for HTML content purification, AnQiCMS also provides some other filters that are helpful for content processing:

  • urlize/urlizetrunc: These filters automatically convert URL strings in text into clickable hyperlinks. Although they do not "clean" content directly, they help standardize how URLs are displayed and automatically add the rel="nofollow" attribute to the links, which helps with SEO and with preventing malicious redirects.
  • escapejs: If you need to embed user-provided data in JavaScript code, this filter escapes characters that are special in JavaScript to prevent JS injection. For purifying HTML content itself, however, removetags or striptags is usually recommended.
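As a rough illustration of what a urlize-style filter does, the Python sketch below escapes the surrounding text and wraps each http(s) URL in a link carrying rel="nofollow". It is a simplified model, not the AnQiCMS implementation; the function name and the URL pattern are assumptions for this example.

```python
import html
import re

# Simplified URL pattern: http(s):// followed by characters that are not
# whitespace, angle brackets, or quotes. Trailing punctuation is not trimmed.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def urlize(text: str) -> str:
    """Escape the text, then wrap each http(s) URL in a nofollow link."""
    parts = []
    last = 0
    for match in URL_RE.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        url = html.escape(match.group(0))
        parts.append('<a href="%s" rel="nofollow">%s</a>' % (url, url))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


print(urlize("Visit https://example.com now"))
# prints: Visit <a href="https://example.com" rel="nofollow">https://example.com</a> now
```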

Three, comprehensive application and practice

Content purification is not a one-time solution, but requires a combination of various strategies:

  1. Layered defense: First, use the "Content Settings" in the AnQiCMS backend for preliminary filtering. Second, at the template output level, flexibly apply, according to specific needs, striptags/removetags、`