How can HTML tags be displayed safely in a template without being parsed by the browser?

In website content management, we often need to display HTML code snippets, such as code examples in tutorials or syntax explanations in technical articles. However, if content containing HTML tags is output directly to the page, the browser parses and renders it instead of displaying it as plain text. Worse, if the content contains a malicious script such as <script>alert('XSS')</script>, it can trigger a cross-site scripting (XSS) attack, which is a serious security risk for the website.

The AnQiCMS template engine is designed with this in mind and provides a secure, flexible set of mechanisms for displaying HTML content. Understanding these mechanisms lets us display content safely while guarding against potential security risks.

Understanding the template's default behavior: automatic escaping

Like mainstream template engines such as Django's, the AnQiCMS template engine by default automatically HTML-escapes all content output through {{ variable }}. This means that when a template variable contains special HTML characters such as <, >, &, ", and ', they are converted to the corresponding HTML entities: for example, < becomes &lt; and > becomes &gt;.

This default escaping is the first line of defense for website security. Even if users accidentally or maliciously insert HTML tags or JavaScript code when submitting content, that content is rendered harmless and displayed as plain text on the page instead of being parsed and executed by the browser.

For example, suppose you enter the following content in a field in the admin backend:

<p>This is HTML containing <strong>bold</strong> text.</p>
<script>alert('Hello, AnQiCMS!');</script>

and then output it in a template with {{ archive.Content }}. What you actually see in the browser is the markup itself, as literal text:

<p>This is HTML containing <strong>bold</strong> text.</p>
<script>alert('Hello, AnQiCMS!');</script>

You do not get a rendered paragraph and a pop-up. The page source contains &lt; and &gt; in place of the angle brackets, so the browser shows the tags instead of interpreting them. This is automatic escaping in action.
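The effect of automatic escaping can be approximated outside the template engine. The following is a minimal sketch in Python, using the standard library's html.escape as a stand-in for the engine's escaping step. It is not AnQiCMS code, and the exact entity chosen for some characters (the apostrophe in particular) may differ between implementations.

```python
import html

# The same content as in the example above, as it might be stored in a field.
content = (
    "<p>This is HTML containing <strong>bold</strong> text.</p>\n"
    "<script>alert('Hello, AnQiCMS!');</script>"
)

# html.escape replaces &, <, > and (by default) both quote characters
# with HTML entities, roughly what an auto-escaping engine does on output.
page_source = html.escape(content)

# No raw angle brackets remain, so the browser has no tags to interpret.
print(page_source)
```

The escaped string is what ends up in the page source; the browser decodes the entities only for display, which is why the reader sees the original markup as text.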

Explicit escaping: the escape filter

Although default automatic escaping already provides good protection, in some scenarios we may want to tell the template engine explicitly to HTML-escape a variable, especially when dealing with complex data structures or when overriding other behavior that may be in effect. In these cases you can use the escape filter.

The escape filter converts special HTML characters in the content to HTML entities. Usage is straightforward: append |escape to the variable name:

{# Assume raw_html_string holds unprocessed HTML content #}
<div>
    Here is the code we want to show: <code>{{ raw_html_string|escape }}</code>
</div>

For example, if the value of raw_html_string is <img src="x" onerror="alert('XSS')">, then after output through {{ raw_html_string|escape }} the page displays <img src="x" onerror="alert('XSS')"> as plain text, rather than creating an image element that could trigger an attack.

The escape filter is an important tool for ensuring that any content that may contain HTML characters is displayed as plain text. It works alongside default auto-escaping as an explicit safeguard for content display.
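To make the difference between the page source and what the reader sees concrete, here is a small Python sketch using the standard library. html.escape stands in for the escape filter and html.unescape stands in for the browser decoding entities when it renders text. Both are illustrative assumptions, not the AnQiCMS implementation.

```python
import html

# The payload from the example above.
payload = "<img src=\"x\" onerror=\"alert('XSS')\">"

# What an escaping step writes into the page source.
page_source = html.escape(payload, quote=True)

# What the browser shows after decoding the entities as text.
displayed = html.unescape(page_source)

print(page_source)
print(displayed)
```

The page source contains no img tag for the browser to build an element from, while the displayed text is identical to the original payload.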

Clean as needed: the striptags and removetags filters

Sometimes the goal is not to display HTML tags but to remove them completely and keep only the plain text. AnQiCMS provides two filters for this: striptags and removetags.

  • striptags filter: As the name implies, it removes all HTML tags from the content and keeps only the text inside them. This is useful for extracting article summaries, generating plain-text descriptions, and similar scenarios.

    {# Assume article.Description contains HTML content #}
    <p>Article summary: {{ article.Description|striptags }}</p>

    If article.Description is <span>This is a <em>bold</em> description.</span>, then {{ article.Description|striptags }} outputs: This is a bold description.

  • removetags filter: If you only need to remove specific HTML tags rather than all of them, use the removetags filter. It accepts a comma-separated list of tag names as its parameter.

    {# Remove the <b> and <i> tags from the content. #}
    {# |safe lets the remaining tags render, so apply it only to trusted content. #}
    <div>
        Filtered content: {{ some_html_content|removetags:"b,i"|safe }}
    </div>

    Note that striptags and removetags are meant to clean up HTML, not to display it as text. In practice, they are often used to preprocess user-generated content so that it suits page display or search engine crawling.
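The behavior of the two filters can be sketched with regular expressions. The Python functions below, strip_tags and remove_tags, are hypothetical helpers written for illustration. They approximate the behavior described above and are not the AnQiCMS implementation, and regex-based tag removal should not be treated as a security sanitizer.

```python
import re

def strip_tags(text: str) -> str:
    """Remove every HTML tag, keeping only the text between tags."""
    return re.sub(r"<[^>]*>", "", text)

def remove_tags(text: str, tag_names: str) -> str:
    """Remove only the listed tags (comma-separated), keeping their inner text."""
    names = "|".join(re.escape(n.strip()) for n in tag_names.split(",") if n.strip())
    if not names:
        return text
    # Match opening and closing forms, e.g. <b>, </b>, <b class="x">.
    # The \b keeps "b" from also matching tags such as <br>.
    pattern = rf"</?(?:{names})\b[^>]*>"
    return re.sub(pattern, "", text, flags=re.IGNORECASE)

summary = strip_tags("<span>This is a <em>bold</em> description.</span>")
partial = remove_tags("<p>Keep <b>bold</b> and <i>italic</i><br></p>", "b,i")

print(summary)
print(partial)
```

In this sketch the second call leaves the p and br tags in place, which mirrors why the template example pairs removetags with |safe: the remaining markup is still HTML and is only rendered as such if it is marked safe.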

Summary and best practices

In AnQiCMS templates, the key to safely displaying HTML tags without the browser parsing them is to rely on the template engine's automatic HTML escaping and, where needed, to use the escape filter for explicit, forced escaping.

Here are some practical suggestions:

  1. Trust default escaping. For all content retrieved from the backend or entered by users, rely on the template engine's default escaping unless you know for certain that the source is safe and the content is meant to be rendered as HTML.
  2. Use |escape proactively. When you need to display code snippets, HTML examples, or any text that may contain special HTML characters, use {{ variable|escape }} to ensure it is displayed as plain text.
  3. Use |safe with caution. Conversely, if you want the HTML in a variable to be parsed and rendered by the browser (for example, rich text authored in the backend editor), you need {{ variable|safe }}. Make sure such content comes from a reliable source and has been strictly filtered and validated, because misuse of |safe is a major cause of XSS vulnerabilities.
  4. Use striptags and removetags to clean content. When the goal is to remove HTML tags, entirely or selectively, these two filters are the right tools.
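The relationship between default escaping and |safe can be pictured as a marker on the value. The Python sketch below uses a hypothetical SafeString type and render function to show the idea. Real template engines track this differently in detail, so treat it as a conceptual model only.

```python
import html

class SafeString(str):
    """Hypothetical marker type: text the author has declared safe to render as HTML."""

def render(value) -> str:
    """Escape by default; emit unchanged only when explicitly marked safe."""
    if isinstance(value, SafeString):
        return str(value)
    return html.escape(str(value), quote=True)

untrusted = "<b>user input</b>"
trusted = SafeString("<b>editor content</b>")

print(render(untrusted))  # escaped, shown as text
print(render(trusted))    # passed through, rendered as HTML
```

The design point is that escaping is the default and rendering raw HTML is the opt-in, so a forgotten annotation fails toward the safe side.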

By using these template features appropriately, you can display HTML tags safely in AnQiCMS while maintaining the overall security of the website.


Frequently Asked Questions (FAQ)

Q1: Why does the browser not display a paragraph when I output <p>Hello</p> in the template?
A1: This is because of AnQiCMS template