In content operations, we often need to display a shortened version of the content on list pages, aggregation pages, or article summary areas. This optimizes the page layout, improves the user experience, and to some extent helps search engines understand the topic of the content. However, when content is written in Markdown and rendered as HTML, truncating it poses some challenges. Naively truncating HTML by character or byte count can easily leave tags unclosed, break the page structure, and even cause display errors.

As a feature-rich, enterprise-level content management system, AnQiCMS takes this kind of content operation need into account. Its template engine provides dedicated filters for safely truncating HTML content, so that even when the HTML rendered from Markdown is complex, the truncated code remains structurally complete and valid.

The Challenge of Truncating HTML Content

Imagine your article content includes an HTML snippet such as <p><strong>This is <em>a piece of</em> bold and italic text</strong></p>. If you simply cut it at a fixed character position, the result might be <p><strong>This is <em>a piece, which is clearly an incomplete HTML structure. When the browser parses it, the unclosed tags may cause unexpected layout problems, or this part of the content may not display properly at all.
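To make the failure concrete, here is a minimal Python sketch (not AnQiCMS code) that slices an HTML string at a fixed character position and then lists the tags left open. The helper names OpenTagTracker and unclosed_tags are hypothetical and exist only for this illustration; the sketch also assumes the fragment contains no void elements such as <br>.

```python
from html.parser import HTMLParser


class OpenTagTracker(HTMLParser):
    """Records which tags are still open after a fragment has been parsed."""

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the most recent matching tag and anything opened inside it.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx:]


def unclosed_tags(fragment):
    """Return the list of tags that were opened but never closed."""
    tracker = OpenTagTracker()
    tracker.feed(fragment)
    tracker.close()
    return tracker.open_tags


html = "<p><strong>This is <em>bold and italic</em> text</strong></p>"
naive = html[:25]  # naive cut by character position

print(naive)                 # <p><strong>This is <em>bo
print(unclosed_tags(naive))  # ['p', 'strong', 'em']
print(unclosed_tags(html))   # []
```

The naive slice leaves three tags open, which is exactly the kind of fragment that disturbs the layout of the surrounding page.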

Markdown makes content creation efficient, but it also places higher demands on truncation when displaying summaries. We need a method that both controls the length of the content and recognizes and closes open HTML tags, so that the truncated content is still a valid HTML fragment.

AnQiCMS's solution: the truncatewords_html filter

AnQiCMS's template engine offers the truncatewords_html filter, the core tool for safely truncating, by word, overly long HTML content rendered from Markdown. This filter is specifically designed for strings that contain HTML tags: it truncates at the specified word count and automatically closes any HTML tags left open, so the page structure is not damaged.

Because truncation happens at the HTML level, Markdown content in AnQiCMS must first be rendered into HTML, so we start by using the render filter to convert Markdown to HTML.
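The idea behind a tag-aware word truncation can be sketched in a few lines of Python. This is an illustration of the general technique only, not the implementation AnQiCMS uses; the names WordTruncator and truncate_words_html are hypothetical, and appending "..." at the cut point is an assumption of this sketch.

```python
from html import escape
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}  # tags with no closing tag


class WordTruncator(HTMLParser):
    """Copies HTML to the output until max_words words of text have been seen."""

    def __init__(self, max_words):
        super().__init__()
        self.max_words = max_words
        self.words = 0
        self.out = []
        self.open_tags = []
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.done:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.done:
            return
        words = data.split()
        remaining = self.max_words - self.words
        if len(words) <= remaining:
            self.out.append(escape(data, quote=False))
            self.words += len(words)
            return
        kept = words[:remaining]
        if kept:
            prefix = " " if data[:1].isspace() else ""
            self.out.append(prefix + escape(" ".join(kept), quote=False))
        self.out.append("...")
        self.done = True  # ignore everything after the cut


def truncate_words_html(html_text, max_words):
    """Truncate to max_words words, then close whatever tags are still open."""
    parser = WordTruncator(max_words)
    parser.feed(html_text)
    parser.close()
    closing = "".join(f"</{tag}>" for tag in reversed(parser.open_tags))
    return "".join(parser.out) + closing


html = "<p><strong>This is <em>bold and italic</em> text</strong></p>"
print(truncate_words_html(html, 3))
# <p><strong>This is <em>bold...</em></strong></p>
```

Only text nodes count toward the word limit; tags are copied through untouched, and the stack of open tags is unwound at the end so the fragment stays balanced.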

The following are the steps to safely truncate the rendered HTML content in the template.

  1. Get the original Markdown content: In an AnQiCMS template, you can use the archiveList tag to loop through the article list. The item.Content field usually contains the original Markdown content.

  2. Render Markdown to HTML: Use the render filter to convert the Markdown text in item.Content to HTML.

  3. Safely truncate the HTML by words: Pass the rendered HTML to the truncatewords_html filter and specify the number of words you want to keep. For example, truncatewords_html:50 truncates the content to about 50 words.

  4. Mark as safe HTML: Because truncatewords_html returns HTML code, you must append the |safe filter to keep the template engine from escaping it again (which would display the HTML tags as literal text on the page). The safe filter tells the template engine that this content is already-processed HTML and can be output directly.

Below is an example of using the truncatewords_html filter in an AnQiCMS template:

{% archiveList archives with type="page" limit="10" %}
    {% for item in archives %}
    <article>
        <h3><a href="{{item.Link}}">{{item.Title}}</a></h3>
        <div>
            {# Assume item.Content holds the original Markdown content #}
            {# Render the Markdown to HTML first, then truncate to 50 words and mark as safe HTML #}
            {{ item.Content|render|truncatewords_html:50|safe }}
            <a href="{{item.Link}}"> [Read more]</a>
        </div>
    </article>
    {% empty %}
    <p>No articles yet.</p>
    {% endfor %}
{% endarchiveList %}

In this code block:

  • The item.Content variable holds the article content in Markdown format.
  • The |render filter converts it to a standard HTML string.
  • The |truncatewords_html:50 filter then operates on this HTML string, truncating it to about 50 words while making sure HTML tags are closed correctly.
  • The |safe filter ensures that the final HTML fragment is parsed by the browser instead of being displayed as plain text.

Other truncation options: truncatechars_html

If you prefer to control the summary length by character count rather than word count, AnQiCMS also provides the truncatechars_html filter. It is used the same way as truncatewords_html, but the unit of truncation is characters. As before, the Markdown content must first be passed through render, and |safe must be added at the end.

{% archiveList archives with type="page" limit="10" %}
    {% for item in archives %}
    <article>
        <h3><a href="{{item.Link}}">{{item.Title}}</a></h3>
        <div>
            {# Render the Markdown to HTML first, then truncate to 150 characters and mark as safe HTML #}
            {{ item.Content|render|truncatechars_html:150|safe }}
            <a href="{{item.Link}}"> [Read more]</a>
        </div>
    </article>
    {% empty %}
    <p>No articles yet.</p>
    {% endfor %}
{% endarchiveList %}
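For comparison, a character-based variant of the same idea can be sketched in Python as follows. Again, this is a hedged illustration rather than AnQiCMS's implementation: CharTruncator and truncate_chars_html are hypothetical names, and the real filter may count the ellipsis or whitespace differently.

```python
from html import escape
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}  # tags with no closing tag


class CharTruncator(HTMLParser):
    """Copies HTML to the output until max_chars characters of text were emitted."""

    def __init__(self, max_chars):
        super().__init__()
        self.remaining = max_chars
        self.out = []
        self.open_tags = []
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if not self.done and self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.done:
            return
        if len(data) <= self.remaining:
            self.out.append(escape(data, quote=False))
            self.remaining -= len(data)
        else:
            # Cut inside this text node; tags are never split.
            self.out.append(escape(data[:self.remaining], quote=False) + "...")
            self.done = True


def truncate_chars_html(html_text, max_chars):
    """Truncate to max_chars text characters, then close any open tags."""
    parser = CharTruncator(max_chars)
    parser.feed(html_text)
    parser.close()
    closing = "".join(f"</{tag}>" for tag in reversed(parser.open_tags))
    return "".join(parser.out) + closing


html = "<p><strong>This is <em>bold and italic</em> text</strong></p>"
print(truncate_chars_html(html, 10))
# <p><strong>This is <em>bo...</em></strong></p>
```

Note that the character version can cut in the middle of a word ("bo..."), which is the practical trade-off against word-based truncation.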

Practices and Considerations

  • Choosing the truncation length: A common summary length is 50 words or 150 characters, but the right length depends on your website design and content type. Test to find the length that best encourages users to click "Read more".
  • "Read more" link: Add a clear "Read more" link after the truncated content to guide users to the full article and improve the user experience.
  • SEO impact: The summary should be rich enough to convey the core topic of the article. Although search engines crawl the entire page, a good summary has a positive effect on time on page and click-through rate.
  • Performance: Although AnQiCMS is built on Go and performs well, excessive or complex template processing may slightly increase rendering time. Choose a truncation length and filter that meet your display requirements while keeping pages loading quickly.

By flexibly combining the render, truncatewords_html, and safe template filters, you can easily manage and display long HTML content rendered from Markdown, keeping pages attractive and user-friendly while preserving the integrity and security of the HTML structure.


Frequently Asked Questions (FAQ)

Q1: What is the difference between the truncatewords_html and truncatechars_html filters? Which one should I choose?

A1: truncatewords_html truncates by word count; for example, truncatewords_html:50 keeps about 50 words. This approach matches how people read and usually preserves the meaning of the summary. truncatechars_html truncates by character count; for example, truncatechars_html:150 keeps about 150 characters. Both handle HTML tags so that the truncated HTML structure remains valid. Which one to choose depends on your needs: if you care most about the summary reading naturally, choose truncatewords_html; if you need to strictly control the visual length of the summary (for example, to fit a fixed-width layout), truncatechars_html may be more suitable.

Q2: Why do I still need to add the |safe filter after using truncatewords_html or truncatechars_html?

A2: To prevent cross-site scripting (XSS) attacks, AnQiCMS's template engine escapes all output variables by default. This means that when truncatewords_html or truncatechars_html returns HTML code (containing tags such as <p> and <strong>), the template engine converts < to &lt; and > to &gt;, so the page shows the HTML tags as literal text instead of rendering them. The |safe filter explicitly tells the template engine that this content is "safe" HTML that does not need escaping and can be output directly, so the browser renders the truncated HTML correctly.
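The effect of auto-escaping can be reproduced with Python's standard library. The function output_variable below is a hypothetical stand-in for what a template engine does when it prints a variable; it is not part of AnQiCMS.

```python
from html import escape


def output_variable(value, mark_safe=False):
    """Mimic auto-escaping: escape the value unless it was marked safe."""
    return value if mark_safe else escape(value, quote=False)


summary = "<p><strong>Hello</strong> world...</p>"

print(output_variable(summary))
# &lt;p&gt;&lt;strong&gt;Hello&lt;/strong&gt; world...&lt;/p&gt;

print(output_variable(summary, mark_safe=True))
# <p><strong>Hello</strong> world...</p>
```

The first form shows up in the browser as literal tag text; the second is rendered as formatted HTML, which is why it should only be used on content you trust.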

Q3: Are there any recommended standards or factors to consider when setting the truncation length?

A3: There is no absolute standard for truncation length; it should be determined by your website design, content type, and target user experience. Generally speaking, for the abstracts on article list pages, 5