How does the `truncatechars_html` filter safely truncate HTML content without breaking the tag structure?


In website operation, we often need to show a summary of longer content on a page: the article list on the homepage, a short introduction on a product detail page, or the recommended content in a module. These summaries need to attract clicks while keeping the page layout neat. However, when the content contains rich HTML formatting (bold, italics, images, links, and so on), simply cutting it at a character count often causes trouble: the HTML tag structure breaks, the page renders chaotically, and the overall styling can be affected.

Imagine a carefully formatted article whose summary, because of a careless cut, is left with an unclosed `<div>` or with half of an image tag such as `<img src="..." alt="...`. The result hurts the user experience by making the page look disorganized, and it may also hurt search engine optimization (SEO), because search engines tend to favor pages with good structure and standards-compliant code.

The AnQi CMS template system, which borrows Django's flexible syntax, includes a practical filter named `truncatechars_html` that addresses exactly this problem. It truncates content containing HTML tags while keeping the resulting HTML complete and valid, without damaging the original tag structure.

How `truncatechars_html` keeps HTML truncation safe

The `truncatechars_html` filter is straightforward to use. Pass the content variable to the filter with the pipe symbol `|` and specify the number of visible characters you want to keep.

For example, if your article content is stored in the variable `article.Content` and you want the first 120 visible characters as a summary:

{{ article.Content|truncatechars_html:120|safe }}

The key is the 'smart' part of `truncatechars_html`. It does not simply count 120 characters from the beginning and cut. Instead, it will:

  1. Identify HTML tags: It distinguishes HTML tags (for example `<strong>`, `<a>`, `<p>`) from the actual text content.
  2. Count visible characters: It counts only the text characters the user can see and ignores the characters taken up by the HTML tags themselves.
  3. Truncate safely: When the specified length is reached and the cut would fall in the middle of an HTML tag, `truncatechars_html` adjusts the cut so that no tag is left as an incomplete fragment.
  4. Close open tags: If any tags remain unclosed after truncation (such as a `<div>` that was opened but has no matching `</div>`), it appends the correct closing tags at the end, so the resulting fragment is a structurally complete block of HTML.
  5. Add an ellipsis: By default, if the content was truncated, `truncatechars_html` appends an ellipsis "..." to indicate that the content is incomplete.
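
The steps above can be sketched in Python with the standard library's `html.parser`. This is only an illustration of the general technique, not the AnQi CMS implementation: the names `truncate_html`, `_Truncator`, and `VOID_TAGS` are made up for this example, and it assumes reasonably well-formed input, counts whitespace between tags as visible text, and does not count the ellipsis toward the limit.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag (hypothetical helper constant).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _Truncator(HTMLParser):
    """Copies HTML to self.out until `limit` visible characters are emitted."""

    def __init__(self, limit, ellipsis="..."):
        super().__init__(convert_charrefs=True)
        self.limit = limit
        self.ellipsis = ellipsis
        self.out = []        # output fragments
        self.open_tags = []  # stack of tags opened but not yet closed
        self.count = 0       # visible characters emitted so far
        self.done = False    # True once the limit has been hit

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())  # tags are copied whole
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        if not self.done:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.done or not self.open_tags or self.open_tags[-1] != tag:
            return  # ignore stray closers; this is not a full HTML5 parser
        self.open_tags.pop()
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.done:
            return
        remaining = self.limit - self.count
        if len(data) > remaining:
            # Only text is ever cut, so a tag can never be split in half.
            self.out.append(escape(data[:remaining], quote=False) + self.ellipsis)
            self.done = True
        else:
            self.out.append(escape(data, quote=False))
            self.count += len(data)


def truncate_html(source, limit, ellipsis="..."):
    parser = _Truncator(limit, ellipsis)
    parser.feed(source)
    parser.close()
    # Close whatever is still open, innermost tag first.
    closers = "".join(f"</{tag}>" for tag in reversed(parser.open_tags))
    return "".join(parser.out) + closers


print(truncate_html("<div><p>Hello <b>brave</b> new world</p></div>", 9))
# <div><p>Hello <b>bra...</b></p></div>
```

Because only text nodes are ever sliced, the cut can never land inside a tag, and the stack of open tags is what makes it possible to append the right closing tags in the right order.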

Let's experience the magic through a simple example. Suppose you have a piece of HTML content:

<div class="foo">
  <p>这是一段很长的<b>测试文本</b>,它会被安全地截取,而不会破坏HTML结构。</p>
  <ul>
    <li>列表项1</li>
    <li>列表项2</li>
  </ul>
</div>

If you apply `truncatechars_html:25` to this content:

{{ "<div class=\"foo\"><p>这是一段很长的<b>测试文本</b>,它会被安全地截取,而不会破坏HTML结构。</p><ul><li>列表项1</li><li>列表项2</li></ul></div>"|truncatechars_html:25|safe }}

The result looks like this (the exact output depends on the content and on where the cut falls):

<div class="foo"><p>这是一段很长的<b>测试文本</b>,它会被安全地截取,而不会破...</p></div>

As you can see, the `<ul>` and `<li>` elements that come after the cut-off point are dropped, but `<div>` and `<p>` are both closed properly, so the HTML structure stays intact. A plain `truncatechars` filter, by contrast, may cut inside a `<p>` element or in the middle of a `<b>` tag and cause HTML rendering errors.
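
To make that risk concrete, here is a small Python illustration in which a plain string slice stands in for a tag-unaware truncation. The sample string and variable names are invented for this example, and the slice is an assumption about how a naive cut behaves, not a reproduction of the `truncatechars` filter itself.

```python
source = "<p>Hello <b>brave</b> new world</p>"

# A plain slice counts markup as characters and knows nothing about tags.
cut_inside_tag = source[:11]      # stops in the middle of "<b>"
cut_inside_element = source[:14]  # leaves <p> and <b> unclosed

print(cut_inside_tag)      # <p>Hello <b
print(cut_inside_element)  # <p>Hello <b>br
```

Either fragment, once dropped into a page, leaves the browser to guess where the elements end, which is how bold text or a paragraph style can leak into the rest of the layout.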

Application scenarios in practice

`truncatechars_html` is widely used in day-to-day content operations with AnQi CMS:

  • Article list summaries: On a blog or news list page, show a concise excerpt of each article that conveys the key information without letting long content stretch the layout.
  • Short product descriptions: On e-commerce sites, show each product's core selling points on the product list page while keeping the page fast and tidy.
  • Search result previews: In on-site search results, show fragments of the matching content so users can quickly judge whether it is what they need.
  • Recommendation modules: In sidebars, footer recommendation blocks, and similar modules, show the highlights of related content to attract clicks.

With this filter, content editors can use the rich text editor in the back end to create richly formatted content without worrying about complex truncation logic on the front end. `truncatechars_html` handles the truncation and keeps your pages looking professional and tidy.

Frequently Asked Questions (FAQ)

1. Does `truncatechars_html` handle Chinese characters correctly? How does it calculate length?

Yes, `truncatechars_html` truncates Chinese characters correctly. It measures length in characters rather than bytes, so a Chinese character and an English letter each count as 1 character, which keeps the behavior consistent and accurate in multi-language environments.
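
The difference between counting characters and counting bytes is easy to see in a short Python snippet. The sample string is invented for this example, and Python's `len` on a `str` (which counts Unicode code points) is used here as a stand-in for "characters".

```python
text = "AnQi内容管理"

char_count = len(text)                  # counts Unicode characters
byte_count = len(text.encode("utf-8"))  # counts UTF-8 bytes

# Cutting by characters keeps every character whole.
by_chars = text[:5]

# Cutting by bytes can split a multi-byte character; the broken
# remainder is shown as the replacement character U+FFFD.
by_bytes = text.encode("utf-8")[:5].decode("utf-8", errors="replace")

print(char_count, byte_count)  # 8 16
print(by_chars)                # AnQi内
```

Each Chinese character here takes three bytes in UTF-8, so a byte-based limit of 5 lands inside the first one, while a character-based limit of 5 does not.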

2. Will an ellipsis be added if the visible length of the content is shorter than the length I set?

No. `truncatechars_html` appends the ellipsis "..." only when the actual content is longer than the length you set. If the original content is short enough to fit, it is output as is, with no ellipsis added.

3. What is the difference between `truncatechars_html` and `truncatewords_html`? Which one should I choose?

Both filters truncate HTML content safely; the main difference is the unit they count:

  • `truncatechars_html` truncates by character. It counts visible characters from the beginning and truncates safely once the specified length is reached. Even if the cut falls in the middle of a word, it keeps the part before the cut and adds an ellipsis.
  • `truncatewords_html` truncates by word. It counts visible words and truncates safely once the specified word count is reached, so the extracted content ends with a complete word.

Which one to choose depends on your needs. If you have a strict character limit (such as fixed-width cards), `truncatechars_html` is probably the better fit. If you care more about semantic integrity and want the summary to always end on a complete word, `truncatewords_html` is the better choice.
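
The two counting units can be compared side by side on plain text. The Python sketch below ignores HTML tags entirely and only shows where each strategy cuts; `truncate_chars` and `truncate_words` are hypothetical helper names, and the exact ellipsis formatting of the real filters may differ.

```python
def truncate_chars(text, limit, ellipsis="..."):
    # Cut after `limit` characters, even in the middle of a word.
    return text if len(text) <= limit else text[:limit] + ellipsis


def truncate_words(text, limit, ellipsis="..."):
    # Cut after `limit` whitespace-separated words, keeping words whole.
    words = text.split()
    return text if len(words) <= limit else " ".join(words[:limit]) + ellipsis


summary = "Safe HTML truncation keeps tags balanced"
print(truncate_chars(summary, 12))  # Safe HTML tr...
print(truncate_words(summary, 3))   # Safe HTML truncation...
```

The character version gives a predictable width, while the word version gives a cleaner ending at the cost of a less predictable length.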
