In content operations, we often need to display a brief version of the content on list pages, aggregation pages, or in article summary areas. This not only tidies the page layout and improves the user experience, but also helps search engines better understand the topic of the content. However, when content is written in Markdown and finally rendered as HTML, truncating it becomes tricky. Naively truncating HTML by characters or bytes can easily leave tags incomplete, break the page structure, and even cause display errors.

As a feature-rich enterprise-level content management system, AnQiCMS takes this kind of content operation need into account. Its built-in template engine provides dedicated filters for safely truncating HTML content. Even when the HTML rendered from Markdown is very complex, these filters are designed to keep the truncated markup structurally complete and semantically correct.

The challenge of truncating HTML content

Imagine your article content includes an HTML snippet such as <p><strong>This is <em>a piece</em> of bold and italic text</strong></p>. If you simply cut it off after a fixed number of characters, the result might be <p><strong>This is <em>a piece</em>, which is clearly an incomplete HTML structure: the <strong> and <p> tags are never closed. The browser may then produce unexpected layout problems or fail to display this part of the content properly.
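To make the problem concrete, here is a minimal Python sketch (not AnQiCMS code) that slices an HTML string at a fixed offset and then reports which tags were left open. The helper names OpenTagTracker and unclosed_tags are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (partial list, enough for the demo).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class OpenTagTracker(HTMLParser):
    """Records which elements are still open after parsing a fragment."""

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the most recent matching element and anything nested in it.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx:]


def unclosed_tags(fragment):
    """Return the list of tags that the fragment opens but never closes."""
    tracker = OpenTagTracker()
    tracker.feed(fragment)
    tracker.close()
    return tracker.open_tags


html = "<p><strong>This is <em>a piece</em> of bold and italic text</strong></p>"
naive = html[:30]  # plain string slicing, unaware of tags

print(naive)                 # the broken fragment
print(unclosed_tags(naive))  # tags the browser would have to guess how to close
print(unclosed_tags(html))   # the untouched original leaves nothing open
```

Any slice that ends inside nested markup leaves a non-empty list here, which is exactly the situation a tag-aware truncation filter has to repair.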

Markdown makes content creation efficient, but the HTML it renders to places higher demands on truncation when summaries are displayed. We need a method that controls the length of the content while also recognizing and closing HTML tags, so that the truncated content is still a valid HTML fragment.

The AnQiCMS solution: the truncatewords_html filter

The AnQiCMS template engine provides the truncatewords_html filter, the core tool for safely truncating Markdown-rendered HTML by word count. This filter is designed specifically for strings that contain HTML tags: it truncates to the specified number of words and automatically closes any tags left open, so the page structure is not damaged.
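The general technique behind a filter like this can be sketched in a few lines of Python. This is an illustration of tag-aware word truncation under stated assumptions, not the AnQiCMS implementation: words are assumed to be separated by whitespace, the "..." suffix is an arbitrary choice, and the names WordTruncator and truncate_words_html are hypothetical.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "wbr"}


class WordTruncator(HTMLParser):
    """Copies HTML to a buffer until `limit` words of text have been kept."""

    def __init__(self, limit):
        super().__init__(convert_charrefs=True)
        self.limit = limit
        self.count = 0        # words kept so far
        self.out = []         # output pieces
        self.open_tags = []   # stack of elements still open
        self.done = False     # set once the word limit is exceeded

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())  # keep the tag as written
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.done:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.done:
            return
        words = data.split()
        remaining = self.limit - self.count
        if len(words) <= remaining:
            self.count += len(words)
            self.out.append(escape(data, quote=False))
            return
        # Limit exceeded inside this text node: keep only what still fits.
        kept = " ".join(words[:remaining])
        prefix = " " if kept and data[:1].isspace() else ""
        self.out.append(escape(prefix + kept, quote=False) + "...")
        self.done = True


def truncate_words_html(fragment, limit):
    """Truncate to `limit` words, then close every element still open."""
    parser = WordTruncator(limit)
    parser.feed(fragment)
    parser.close()
    closing = "".join(f"</{tag}>" for tag in reversed(parser.open_tags))
    return "".join(parser.out) + closing


html = "<p><strong>This is <em>a piece</em> of bold and italic text</strong></p>"
print(truncate_words_html(html, 5))
```

The key design point is the stack of open elements: text is counted only in text nodes, tags pass through untouched, and whatever remains on the stack when the limit is reached is closed in reverse order.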

In addition, because Markdown content in AnQiCMS must be rendered to HTML before it can be truncated at the HTML level, we first need to use the render filter to convert the Markdown to HTML.

The following are the steps to safely truncate HTML rendered from Markdown in a template:

  1. Get the original Markdown content: In an AnQiCMS template, you can use archiveList to loop through the article list; the item.Content field usually contains the original Markdown content.

  2. Render Markdown to HTML: Use the render filter to convert the Markdown text in item.Content to HTML.

  3. Safely truncate the HTML by word: Pass the rendered HTML to the truncatewords_html filter and specify the number of words to keep. For example, truncatewords_html:50 truncates the content to about 50 words.

  4. Mark as safe HTML: Because truncatewords_html returns HTML code, you must append the |safe filter so that the template engine does not escape it again (which would make the page display the HTML tags as literal text). The safe filter tells the template engine that the content is already-processed HTML and can be output directly.

Below is an example of using the truncatewords_html filter in an AnQiCMS template:

{% archiveList archives with type="page" limit="10" %}
    {% for item in archives %}
    <article>
        <h3><a href="{{item.Link}}">{{item.Title}}</a></h3>
        <div>
            {# Assume item.Content holds the original Markdown content #}
            {# Render the Markdown to HTML, truncate to 50 words, then mark the result as safe HTML #}
            {{ item.Content|render|truncatewords_html:50|safe }}
            <a href="{{item.Link}}"> [Read more]</a>
        </div>
    </article>
    {% empty %}
    <p>No articles yet.</p>
    {% endfor %}
{% endarchiveList %}

In this code block:

  • The item.Content variable holds the article content in Markdown format.
  • The |render filter converts it to a standard HTML string.
  • The |truncatewords_html:50 filter then operates on this HTML string, truncating it to about 50 words while making sure the HTML tags are properly closed.
  • The |safe filter ensures that the final HTML snippet is parsed by the browser rather than displayed as plain text.

Other truncation options: truncatechars_html

If you prefer to control the summary length by character count instead of word count, AnQiCMS also provides the truncatechars_html filter. Its usage is similar to truncatewords_html, except that the unit of truncation is characters. As before, the Markdown content must first be passed through render, and the |safe filter must be added at the end.

{% archiveList archives with type="page" limit="10" %}
    {% for item in archives %}
    <article>
        <h3><a href="{{item.Link}}">{{item.Title}}</a></h3>
        <div>
            {# Render the Markdown to HTML, truncate to 150 characters, then mark the result as safe HTML #}
            {{ item.Content|render|truncatechars_html:150|safe }}
            <a href="{{item.Link}}"> [Read more]</a>
        </div>
    </article>
    {% empty %}
    <p>No articles yet.</p>
    {% endfor %}
{% endarchiveList %}
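When deciding between the two units, it can help to measure the visible text of a rendered fragment both ways. The Python sketch below is only an illustration and assumes that words are separated by whitespace; whether AnQiCMS counts words exactly this way is an assumption to verify against its documentation. The helper names visible_text and summary_units are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(fragment):
    """Return the fragment's text with all tags removed."""
    extractor = TextExtractor()
    extractor.feed(fragment)
    extractor.close()
    return "".join(extractor.parts)


def summary_units(fragment):
    """Measure visible text in whitespace-delimited words and in characters."""
    text = visible_text(fragment)
    return {"words": len(text.split()), "chars": len(text)}


english = "<p>Safe truncation keeps <em>every</em> tag closed.</p>"
chinese = "<p>这是<em>一段</em>没有空格的中文文本。</p>"

print(summary_units(english))
print(summary_units(chinese))
```

Under that whitespace assumption, text written without spaces between words (as in Chinese) counts as very few "words" regardless of its length, so a character-based limit may give more predictable summaries for such content.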

Practical considerations

  • Choosing the truncation length: 50 words or 150 characters is a common summary length, but the right length depends on your website design and content type. Test to find the length that best encourages users to click "Read more".
  • "Read more" link: Add a clear "Read more" link after the truncated content to guide users to the full article and improve the user experience.
  • SEO impact: The summary should be rich enough to convey the core theme of the article. Although search engines crawl the entire page, a good summary can have a positive effect on dwell time and click-through rate.
  • Performance: Although AnQiCMS is written in Go and performs well, excessive or complex template processing may still slightly increase rendering time. Choose a truncation length and filter that meet the display requirements while keeping pages loading quickly.

By flexibly combining the render, truncatewords_html, and safe template filters in AnQiCMS, you can easily manage and display long HTML content rendered from Markdown, preserving the page's appearance and user experience while keeping the HTML structure intact and safe.


Common Questions (FAQ)

Q1: What is the difference between the truncatewords_html and truncatechars_html filters? Which one should I choose?

A1: truncatewords_html truncates by word count; for example, truncatewords_html:50 keeps about 50 words. This approach is closer to how people read and usually preserves the semantic integrity of the summary. truncatechars_html truncates by character count; for example, truncatechars_html:150 keeps about 150 characters. Both handle HTML tags intelligently so that the truncated HTML structure remains valid. If semantic integrity matters most, truncatewords_html is usually the better fit; if you need to strictly control the visual length of the summary (for example, to fit a fixed-width layout), truncatechars_html may be more suitable.

Q2: Why do I also need to add the |safe filter after using truncatewords_html or truncatechars_html?

A2: To prevent cross-site scripting (XSS) attacks, the AnQiCMS template engine HTML-escapes all output variables by default. This means that when truncatewords_html or truncatechars_html returns HTML code (containing tags such as <p> and <strong>), the template engine converts < to &lt; and > to &gt;, so the page shows the tags as literal text instead of the rendered result. The |safe filter explicitly tells the template engine that this content is "safe" HTML that does not need escaping and can be output directly, so the browser can render the truncated HTML correctly.
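The effect of default escaping can be reproduced with Python's standard html.escape function. This is only an analogy for what an auto-escaping template engine does; the render_variable helper and its safe flag are hypothetical and are not part of AnQiCMS.

```python
from html import escape


def render_variable(value, safe=False):
    """Mimic template output: escape by default, pass through when marked safe."""
    return value if safe else escape(value, quote=False)


truncated = "<p><strong>This is <em>a piece</em> of...</strong></p>"

# Without the safe marker the browser receives entity-encoded text,
# so the visitor sees the literal tags on the page.
print(render_variable(truncated))

# With the safe marker the markup reaches the browser unchanged and is rendered.
print(render_variable(truncated, safe=True))
```

Because marking content as safe switches off this protection, it should only be applied to HTML you trust, such as output the site itself rendered from its own articles.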

Q3: What are the recommended standards or factors to consider when setting the truncation length?

A3: There is no absolute standard for the truncation length; it should be decided based on your website design, content type, and target user experience. Generally speaking, for the abstracts of article list pages, 5