How to avoid garbled or incompletely truncated Chinese text caused by the `slice` operation in AnQiCMS templates?


When developing website content with AnQiCMS, you may find that the `slice` filter in templates seems to handle Chinese content poorly, producing truncated text that looks incomplete. This hurts both the user experience and the accuracy of what is displayed. Although the system is designed at a low level to avoid garbled characters, the multi-byte nature of Chinese characters and the gap between expected and actual display make it worth understanding how the filter works.

AnQiCMS is a content management system written in Go. Its template engine supports UTF-8 well and explicitly requires that all template files be saved as UTF-8. This is the basis for processing Chinese content and the key to avoiding garbled characters. The `slice` filter operates on Go `rune` values (Unicode code points, which can be thought of as single characters) rather than on bytes. As a result, when you use `slice` to extract Chinese content, every character stays intact: no Chinese character is cut in half, so truly garbled output does not occur.

For example, given the Chinese string “你好世界” (“Hello World”), `{{ "你好世界"|slice:"1:3" }}` returns “好世”. The `slice` operation counts characters: it starts at index 1 (the second character, “好”) and stops before index 3 (the fourth character, “界”). When processing UTF-8 encoded Chinese, the `slice` filter correctly identifies and extracts whole characters and produces no garbled output.
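The index arithmetic in this example can be reproduced in Go. This is a sketch under assumptions: `runeSlice` is a hypothetical helper, and clamping out-of-range bounds is a choice made here for safety, not documented AnQiCMS behavior.

```go
package main

import "fmt"

// runeSlice returns the characters of s in the half-open range [from, to),
// counted in runes rather than bytes. Out-of-range bounds are clamped
// (an assumption of this sketch).
func runeSlice(s string, from, to int) string {
	r := []rune(s)
	if from < 0 {
		from = 0
	}
	if to > len(r) {
		to = len(r)
	}
	if from >= to {
		return ""
	}
	return string(r[from:to])
}

func main() {
	fmt.Println(runeSlice("你好世界", 1, 3)) // 好世
	fmt.Println(runeSlice("你好世界", 0, 2)) // 你好
}
```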

Why, then, can the result still feel “incomplete”? Usually because what you expect from `slice` differs from what it actually does. In front-end display, we are often used to cutting by visual width (a Chinese character typically occupies the width of two English characters) or to having an ellipsis added automatically after the cut. The `slice` filter truncates purely by character count: it does not account for visual width and does not add an ellipsis. This difference can make the truncated content look “incomplete” or “too short”.
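To make the width mismatch concrete, here is a rough Go sketch that counts display columns instead of characters. It is an approximation only: `runeWidth`, `displayWidth`, and `cutToWidth` are hypothetical helpers, and treating Han, Hiragana, Katakana, and Hangul characters as two columns wide ignores full-width punctuation and other edge cases.

```go
package main

import (
	"fmt"
	"unicode"
)

// runeWidth approximates the display width of r: 2 columns for common
// CJK scripts, 1 column for everything else (a simplifying assumption).
func runeWidth(r rune) int {
	if unicode.In(r, unicode.Han, unicode.Hiragana, unicode.Katakana, unicode.Hangul) {
		return 2
	}
	return 1
}

// displayWidth sums the approximate column width of every character in s.
func displayWidth(s string) int {
	w := 0
	for _, r := range s {
		w += runeWidth(r)
	}
	return w
}

// cutToWidth returns the longest prefix of s that fits in max columns,
// always cutting on a character boundary.
func cutToWidth(s string, max int) string {
	w := 0
	for i, r := range s {
		w += runeWidth(r)
		if w > max {
			return s[:i]
		}
	}
	return s
}

func main() {
	fmt.Println(displayWidth("你好世界"))            // 8
	fmt.Println(displayWidth("Hello"))           // 5
	fmt.Println(cutToWidth("AnQiCMS内容管理", 10)) // AnQiCMS内
}
```

Cutting the same string to 10 characters would instead keep “AnQiCMS内容管”, which is why a character-count limit tuned for English text can look wrong for mixed or Chinese content.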

To avoid this misunderstanding and improve how Chinese content is displayed, choose the filter that fits the task when truncating Chinese text in AnQiCMS templates:

First, make sure that all your AnQiCMS template files are saved in UTF-8 encoding. This is the foundation for all character processing, and `design-convention.md` also emphasizes this point.

Second, when you really do need to cut content strictly by character (rune) count, the `slice` filter is entirely suitable. For example, to get the first 10 characters of a title: `{{ archive.Title|slice:":10" }}`. If the content is for front-end display and you want an ellipsis after truncation, the `truncatechars` or `truncatechars_html` filter is a better choice.

`truncatechars` filter: This filter handles multi-byte characters correctly and automatically appends an ellipsis (`...`) after truncating to the specified number of characters. It keeps characters intact and better matches the usual needs of text previews. For example, `{{ archive.Description|truncatechars:30 }}` truncates the description to 30 characters and adds an ellipsis when needed. For content containing HTML tags, use `truncatechars_html`, which tries to keep the HTML structure intact while truncating.
`length` filter: Use it before truncating. The `length` filter returns the actual character count of the content, which helps you control the truncation length more precisely in template conditions. For example: `{% if archive.Title|length > 20 %}...{% endif %}`.
`make_list` filter: If you need finer character-level traversal or processing of Chinese content (for example, testing or modifying each character individually), the `make_list` filter splits a string into an array of `rune` values that is convenient to iterate over.

In summary, the AnQiCMS `slice` filter processes Chinese text by Unicode character (rune) rather than by byte, which keeps the encoding of Chinese content intact at the lowest level and avoids truly garbled text. When truncating Chinese content in AnQiCMS templates, choose according to your actual needs: `slice` for precise character-count truncation, or `truncatechars` and `truncatewords` for front-end previews with an automatic ellipsis. This ensures correct display and a good user experience. The key is to understand how each filter works and pick the tool that best fits your scenario.

Frequently Asked Questions (FAQ)

Question: Why does my Chinese title still look incomplete after being truncated with `slice`, even though nothing is garbled?
Answer: This usually comes from a misunderstanding of how `slice` behaves. In AnQiCMS, `slice` cuts by the number of Unicode characters (runes), and one Chinese character counts as one `rune`. But if the display space reserved for the truncated content in the template was sized for English character widths (one Chinese character usually takes the width of two English characters), the result can look too short or incomplete even though the `rune` count is correct. In this case, consider using the `truncatechars` filter, which not only truncates by characters but also adds an ellipsis automatically and better matches common text preview needs.
Question: Can an AnQiCMS template cut Chinese content directly by byte count?
Answer: AnQiCMS template filters such as `slice` and `truncatechars` were designed with Unicode character integrity in mind, so by default they operate on `rune` values (characters) rather than bytes, which prevents Chinese content from being garbled by byte-level truncation. The built-in filters do not provide truncation by byte count, and we do not recommend it: cutting through a multi-byte character (such as a Chinese character) easily produces incomplete content and garbled display. If you really do have such a requirement, consider a custom template function or byte-level processing in Go on the backend before the content reaches the template, and do so carefully to avoid problems.
Question: My AnQiCMS template files are all saved in UTF-8, but strange symbols still appear occasionally. What is going on?
Answer: Even if the template file is UTF-8 encoded, if the content source itself (such as data read from a database, user input, or third-party collection) contains non-UTF-8 encoded characters, or the character sequence does not conform to UTF
