When running a website, we often need to show part of an article, product, or single page on list pages, in summary areas, or in specific blocks. This content is usually rich text containing HTML tags, and truncating it by raw character count can cut a tag in half, breaking the page's layout and rendering. For example, abrupt truncation can produce something like <p>This is a paragraph <strong id="test">with bold</str...</p>, which leaves the <strong> tag unclosed and affects how all subsequent content is rendered.

To solve this problem, the AnQiCMS template engine provides a filter named truncatechars_html. It safely truncates rich text containing HTML tags and automatically closes any tags left incomplete, preserving the integrity of the page structure.

The source of HTML content in AnQiCMS

Rich text content in AnQiCMS mainly comes from the following core fields:

  • Document content (Archive.Content): retrieved with the archiveDetail tag; this is the main content of an article or product detail page.
  • Category content: retrieved with the categoryDetail tag; used to display a category's detailed introduction.
  • Page content: retrieved with the pageDetail tag, for example for an "About Us" page.

These fields are usually produced by the backend rich text editor and contain HTML tags such as <p>, <strong>, <a>, and <img>, so they must be truncated with particular care.
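As a rough sketch, the three fields could be read as follows. The variable names (archiveContent, categoryContent, pageContent) are placeholders chosen for illustration. The snippet assumes that categoryDetail and pageDetail accept the same `variable with name="Content"` form as archiveDetail, that the tags are written without a closing tag, and that each one reads the item belonging to the current page when no id is given.

```django
{# Each fragment belongs in its own template (detail, category, or single page). #}

{# Document content: main body of an article or product detail page #}
{% archiveDetail archiveContent with name="Content" %}
{{ archiveContent|safe }}

{# Category content: detailed introduction of a category (assumed syntax) #}
{% categoryDetail categoryContent with name="Content" %}
{{ categoryContent|safe }}

{# Page content: body of a single page such as "About Us" (assumed syntax) #}
{% pageDetail pageContent with name="Content" %}
{{ pageContent|safe }}
```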

Getting to know the truncatechars_html filter

The truncatechars_html filter is designed specifically for strings that contain HTML tags. Its core advantage is that, when it truncates text to a specified length, it automatically detects and closes any HTML tags that were cut off. You therefore don't need to worry about truncation scrambling page elements or leaving unclosed tags for the browser to misparse.

Unlike simple string-slicing filters such as truncatechars, truncatechars_html "understands" HTML structure. For example, if a string containing <b>Bold text</b> is cut inside the <b> tag, the filter ensures the output is still valid HTML, such as <b>Bold...</b>, rather than leaving a <b> tag open. An ellipsis (...) is usually appended to the truncated content to show that it has been shortened.
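A minimal sketch of the difference, assuming the engine supports a Django-style {% set %} tag for defining a test variable (the name snippet is hypothetical). Both lines cut the same string at the same length; only the second is expected to re-close the <b> tag.

```django
{# Hypothetical test value; any rich text field would behave the same way #}
{% set snippet = "<b>Bold text</b> and some following text" %}

{# Plain truncation counts raw characters, so the cut can land inside #}
{# the <b> element and leave it open #}
{{ snippet|truncatechars:8|safe }}

{# HTML-aware truncation closes any tag that is still open at the cut point #}
{{ snippet|truncatechars_html:8|safe }}
```

Viewing the page source for the two outputs is a quick way to confirm which one leaves the markup unbalanced.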

How to use truncatechars_html safely

Using the truncatechars_html filter is straightforward, but it must be paired with one critical companion filter, safe:

{{ your_html_content|truncatechars_html:length|safe }}

Let's take a practical scenario. Suppose you need to show a brief summary of each article on the article list page, extracted from the full content (archive.Content):

{% archiveList articles with type="page" limit="10" %}
    {% for article in articles %}
    <div class="article-item">
        <h3><a href="{{ article.Link }}">{{ article.Title }}</a></h3>
        <div class="article-summary">
            {# Assumes the article's Content field contains HTML tags #}
            {%- archiveDetail articleContent with name="Content" id=article.Id %}
            {{ articleContent|truncatechars_html:120|safe }}
            <a href="{{ article.Link }}" class="read-more">Read more &gt;</a>
        </div>
    </div>
    {% endfor %}
{% endarchiveList %}

In the code above:

  1. The articleContent variable holds the article's complete HTML rich text content.
  2. truncatechars_html:120 truncates articleContent to approximately 120 characters (the ellipsis counts toward this length) and automatically closes any tags that were cut off.
  3. The |safe filter is the crucial step. For security reasons, the AnQiCMS template engine escapes all output as HTML entities by default. Without |safe, even though truncatechars_html produces correct HTML, its tags are escaped to &lt;p&gt;, &lt;strong&gt;, and so on, and appear on the page as literal text instead of being rendered by the browser. Adding |safe after truncatechars_html is therefore essential: it tells the template engine that this content is safe HTML and can be output as is.
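The effect of step 3 can be seen by rendering the same summary twice, following the escaping behavior described above. This fragment is meant to sit inside the for loop of the list example, where article is defined; the single-tag form of archiveDetail used here is an assumption about the tag syntax.

```django
{% archiveDetail articleContent with name="Content" id=article.Id %}

{# Without |safe: the markup is escaped, so visitors see literal tag text #}
<div class="summary-escaped">{{ articleContent|truncatechars_html:120 }}</div>

{# With |safe placed last: the truncated, re-closed HTML is rendered as markup #}
<div class="summary-rendered">{{ articleContent|truncatechars_html:120|safe }}</div>
```

The class names are illustrative only. Keeping |safe as the last filter in the chain matches the order used throughout this article.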

Practical scenarios and advanced considerations

  • List page summaries: This is the most common use, for example showing content overviews in blog article lists or product catalog pages. Adjusting the truncation length lets the summary fit different page layouts and designs.
  • Choosing the length: Set the excerpt length according to your actual needs and design. Too long and it stops working as a summary; too short and the content feels fragmented.
  • The truncatewords_html filter: If you prefer to truncate by word count rather than character count while still keeping the HTML structure intact, use the truncatewords_html filter. For example, {{ articleContent|truncatewords_html:30|safe }} keeps about 30 words. Which one to choose depends on your content and language.
  • Debugging and checking: If the truncated content does not display as expected, use the dump filter to inspect the variable's content and type before and after truncatechars_html is applied, which helps locate the problem. For example, {{ articleContent|dump }} prints the original content so you can check it for formatting issues.
  • Default ellipsis: By default, ... is appended after truncation. The AnQiCMS documentation does not explicitly say whether this ellipsis can be changed; in most Django-like template engines it is fixed behavior.
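The debugging tip above might look like this in practice. It is only a sketch: it assumes the fragment sits inside the for loop of the list example (so article is defined) and that dump can be chained after another filter in the usual way.

```django
{% archiveDetail articleContent with name="Content" id=article.Id %}

{# Before truncation: inspect the raw value and its type #}
<pre>{{ articleContent|dump }}</pre>

{# After truncation: inspect what truncatechars_html actually produced #}
<pre>{{ articleContent|truncatechars_html:120|dump }}</pre>
```

Remove the dump lines once the problem is found, since they expose raw content to visitors.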

Used well, the truncatechars_html filter makes it easy to truncate rich text safely in AnQiCMS templates, keeping pages tidy and the HTML structure intact, and giving users a better browsing experience.

Frequently Asked Questions (FAQ)

1. Why does the page show literal HTML tag text such as <p>content...</p> instead of formatted content after I use the truncatechars_html filter?

You have most likely forgotten to add the |safe filter after truncatechars_html. To guard against security risks such as XSS attacks, the AnQiCMS template engine escapes all HTML tags in output by default. The |safe filter explicitly tells the template engine that this content is HTML that has already been handled safely and can be rendered directly without escaping.

2. What is the difference between the truncatechars_html and truncatewords_html filters? Which one should I choose?

The main difference lies in the unit of truncation:

  • truncatechars_html: truncates by character count. It counts the specified number of characters from the start of the text and cuts when the limit is reached.
  • truncatewords_html: truncates by word count. It identifies the words in the text and cuts after the specified number of words.

Which one to choose depends on your content type and display needs. For Chinese content, truncating by characters (truncatechars_html) is more common; for English content, truncating by words (truncatewords_html) tends to preserve meaning better, since English words are naturally separated by spaces.
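To compare the two on your own content, one option is to render both versions side by side and pick the one that reads better. This is a sketch under the same assumptions as the list example (it runs inside the for loop where article is defined), and the limits 120 and 30 are arbitrary starting values.

```django
{% archiveDetail articleContent with name="Content" id=article.Id %}

{# Character-based cut: predictable length, often preferred for Chinese text #}
<div class="summary-chars">{{ articleContent|truncatechars_html:120|safe }}</div>

{# Word-based cut: never splits a word, often preferred for English text #}
<div class="summary-words">{{ articleContent|truncatewords_html:30|safe }}</div>
```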

3. Truncated content always ends with an ellipsis. Can I change it?

According to the current AnQiCMS documentation, the ellipsis used by truncatechars_html and truncatewords_html is fixed as "...", and there is no parameter for changing it. If you need a different marker, you would have to post-process the truncated output with a string replacement, but this adds complexity and may interfere with the safe closing of HTML tags, so it is generally not recommended. In most cases, the default ellipsis is enough to signal that the content has been truncated.