When using AnQiCMS for content management and template development, you may run into a confusing display problem: double escaping of HTML entities. HTML that should be rendered as formatted text instead shows up on the page as literal characters such as `<p>`, or, worse, as entity strings such as `&lt;p&gt;`. This spoils the look of the site and can strip the content of its intended styling. Understanding how the AnQiCMS template engine (Pongo2) escapes output makes the problem easy to avoid and to fix.
First, we need to understand why a template engine escapes content at all. The main reason is security, in particular protection against cross-site scripting (XSS) attacks. Imagine that a user types malicious code such as `<script>alert('XSS attack!')</script>` into a comment box. If the template engine wrote it to the page unchanged, the script would run in other visitors' browsers, which is a security vulnerability. To prevent this, the AnQiCMS template engine treats all output as untrusted by default. It automatically converts the characters that are special in HTML, such as `<`, `>`, `&`, `"`, and `'`, into their corresponding HTML entities; for example, `<` becomes `&lt;` and `>` becomes `&gt;`. This default auto-escaping is an important safeguard for the site.
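The effect of escaping on a payload like the one described above can be reproduced outside the template engine. The sketch below uses Python's standard `html.escape` purely as a stand-in for the engine's escaping step. It is not Pongo2 code, and the exact entity an implementation picks for quote characters may differ.

```python
import html

# A comment submitted by an untrusted user (illustrative payload).
user_comment = "<script>alert('XSS attack!')</script>"

# Escaping replaces the characters that have special meaning in HTML
# with entities, so the browser shows the text instead of running it.
escaped = html.escape(user_comment)

print(escaped)
```

Once escaped, the string no longer contains a literal `<script>` tag, so the browser displays it as text rather than executing it.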
So how does double escaping arise? It usually happens with content entered in the back-end rich text editor, or with text imported from outside that already contains HTML tags. For example, a paragraph typed into the editor may be stored in the database as `<p>This is a paragraph</p>`, or, for some reason, it may already have been escaped once and stored as `&lt;p&gt;This is a paragraph&lt;/p&gt;`.
When content that already contains HTML entities (that is, content that has already been escaped once) is read and output by the AnQiCMS template engine, the engine treats a string such as `&lt;p&gt;` as plain text, not as something that has already been escaped. It therefore escapes it again, converting each `&` into `&amp;`, so the original `&lt;p&gt;` becomes `&amp;lt;p&amp;gt;`. Adding the `|escape` filter to the output variable by hand makes things worse: the documentation states that output is already escaped automatically by default, so an explicit `|escape` escapes the content a second or even a third time. This is the root cause of entity strings such as `&lt;p&gt;` appearing on the page like garbled text.
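The chain described above can be traced step by step. This is a minimal sketch that again uses Python's `html.escape` as a stand-in for each escaping pass, and `html.unescape` to approximate the single level of decoding a browser applies when it renders text. The variable names are illustrative only.

```python
import html

stored = "<p>This is a paragraph</p>"

once = html.escape(stored)    # what a single, correct escape produces
twice = html.escape(once)     # the "&" of every entity is escaped again
thrice = html.escape(twice)   # e.g. one more explicit escape on top

for label, value in [("once", once), ("twice", twice), ("thrice", thrice)]:
    # html.unescape decodes exactly one level, like a browser rendering text.
    print(f"{label:>6}: source={value!r} shown={html.unescape(value)!r}")
```

Each extra pass only touches the `&` characters introduced by the previous pass, which is why the visible text gains one more `&amp;` layer every time.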
To resolve double escaping, we need to tell the template engine explicitly which content is trusted HTML and must not be escaped again. AnQiCMS provides two ways to do this:
- Use the `|safe` filter: This is the most common and most direct solution. When you are sure that a variable holds HTML that has been reviewed and is safe, and you want it rendered as HTML, apply the `|safe` filter. For example, to output the article body on a document detail page, write `{{ articleContent|safe }}`. The filter tells the template engine that the content of `articleContent` is safe HTML and should be output as-is, without escaping. Keep in mind that `|safe` bypasses the default auto-escaping, so use it only on content from sources you fully trust, to avoid potential XSS risks.
- Use the `{% autoescape %}` tag: If you need to control auto-escaping for a specific block of template code, the `autoescape` tag is the right tool. It switches auto-escaping off or on for a region of the template, which suits cases that need finer control, such as mixing escaped and unescaped output in one template section.
  - To turn off auto-escaping: nothing between `{% autoescape off %}` and `{% endautoescape %}` is escaped automatically, for example `{% autoescape off %}{{ some_html_content_variable }}{% endautoescape %}`.
  - To turn on auto-escaping: escaping is on by default, but if you want to re-enable it inside a region where it has been turned off, use `{% autoescape on %}`.
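How the two mechanisms above interact with default escaping can be pictured with a toy model. The sketch below is an assumption-laden illustration in Python, not Pongo2's implementation: `SafeString`, `mark_safe`, and `render` are hypothetical names invented here, with `mark_safe` playing the role of `|safe` and the `autoescape` argument playing the role of the `{% autoescape %}` block setting.

```python
import html


class SafeString(str):
    """Hypothetical marker type: a string declared to be trusted HTML."""


def mark_safe(value: str) -> SafeString:
    # Plays the role of the |safe filter in this toy model.
    return SafeString(value)


def render(value: str, autoescape: bool = True) -> str:
    # Escape unless escaping is switched off for the block
    # or the value has been explicitly marked as trusted.
    if not autoescape or isinstance(value, SafeString):
        return str(value)
    return html.escape(value)


article_content = "<p>This is a paragraph</p>"

print(render(article_content))                    # default: escaped
print(render(mark_safe(article_content)))         # like {{ articleContent|safe }}
print(render(article_content, autoescape=False))  # like {% autoescape off %}
```

The point of the model is that trust is an explicit opt-in: a value is output unescaped only when the template author marks it as safe or disables escaping for the surrounding block.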
In summary, when HTML entities are double-escaped on an AnQiCMS site, it usually means the template engine is being overly "diligent" in protecting your content. By applying the `|safe` filter to HTML you trust, or by using the `{% autoescape %}` tag to control escaping in specific regions, you can make content display as intended while keeping the security protection AnQiCMS provides. The key is to understand the default escaping mechanism and to handle each case according to how trustworthy the content source is.
Frequently Asked Questions (FAQ)
When should the `|safe` filter be used?
Answer: Use `|safe` when you are sure that a variable contains safe, harmless HTML and you want the browser to parse and render it normally. The most typical case is article bodies and product descriptions written in the back-end rich text editor, since this content is usually entered by administrators and is considered trustworthy. Use it with caution, and do not mark unprocessed user input (such as comments or messages) as `safe`; doing so can introduce XSS risks.
Since AnQiCMS escapes automatically by default, what is the `|escape` filter for?
Answer: Default auto-escaping in the AnQiCMS template engine is there to simplify development and keep output safe, but the `|escape` filter still has its uses. For example, inside an `{% autoescape off %}` block, `|escape` lets you escape one particular variable explicitly. It gives you manual, selective escaping where the default has been turned off, and therefore more flexible control. If it is used where default auto-escaping is active, however, it does cause double escaping.
I see characters such as `&amp;lt;p&amp;gt;` on the page. What is going on?
Answer: This usually means the content has been escaped more than twice. First, the content itself may already have been stored in escaped form, as `&lt;p&gt;`. Second, the template engine's default auto-escaping turns that into `&amp;lt;p&amp;gt;`. If the template also applies the `|escape` filter, or the content passes through further processing layers, a third round produces `&amp;amp;lt;p&amp;amp;gt;` in the page source, which the browser displays as `&amp;lt;p&amp;gt;`. When troubleshooting, check the original source of the content, the template output code, and whether several escaping operations or filter chains have been applied. The fix is usually to make sure this kind of content is passed through `|safe` exactly once, or is placed inside `{% autoescape off %}` at the appropriate location.
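For troubleshooting, it can help to measure how many escaping layers a string carries. The following is a rough diagnostic sketch in Python, not an AnQiCMS tool: `escape_depth` is a hypothetical helper that decodes entities repeatedly until the text stops changing. Feed it a snippet copied from the page source (not from the rendered page). A depth of 1 is normal for correctly escaped text; 2 or more points to double escaping.

```python
import html


def escape_depth(text: str, max_levels: int = 10) -> int:
    """Count how many times text can be entity-decoded before it stops changing."""
    depth = 0
    while depth < max_levels:
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded
        depth += 1
    return depth


print(escape_depth("<p>"))                        # 0: raw HTML
print(escape_depth("&lt;p&gt;"))                  # 1: escaped once
print(escape_depth("&amp;lt;p&amp;gt;"))          # 2: double-escaped
print(escape_depth("&amp;amp;lt;p&amp;amp;gt;"))  # 3: triple-escaped
```

This is only a heuristic, since trusted HTML can legitimately contain entities such as `&amp;`, but an unexpectedly high depth narrows down where the extra escaping pass is being applied.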