In day-to-day content management with AnQiCMS, we sometimes need to extract plain text from document content that contains rich formatting (bold, italics, images, links, and so on). This may sound contradictory: a content management system exists to present content in varied ways, so why strip the formatting? Yet in many scenarios plain text is exactly what is needed, for example when generating concise article summaries, filling in the search engine meta description (Meta Description), showing uniform, clean content previews on list pages, or importing content into platforms that do not support HTML.

So how can we do this efficiently in AnQiCMS's flexible template system? AnQiCMS provides powerful filters, and striptags and removetags are the two tools for this job.

Get to know the core tool: the striptags filter

The striptags filter is a very practical feature of AnQiCMS templates, and it does what its name suggests: it strips tags. When you want to remove every HTML or XML tag from a piece of content in one pass and keep only the text, striptags is the filter to use.

Its usage is straightforward: append the pipe symbol | to the variable you want to process, followed by striptags. For example, if a variable named archive.Content holds the article body with HTML formatting, you can get its plain text like this:

{{ archive.Content | striptags }}

This one line goes through archive.Content, removes every HTML tag such as <div>, <p>, <a>, and <img>, and outputs only the visible text.
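
As a quick check, the filter can also be applied to a literal string instead of a variable. The sample string below is made up for illustration. Going by the description above, only the visible text should remain once the tags are removed.

```twig
{# Hypothetical sample string: the tags should be stripped and the text kept #}
{{ "<p>Hello <strong>AnQiCMS</strong></p>" | striptags }}
```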

Handling Markdown content

Note that the AnQiCMS backend may have the Markdown editor enabled for editing document content. In that case archive.Content may contain Markdown text rather than HTML. Applying striptags directly to Markdown text may give unsatisfactory results, because the filter does not recognize Markdown syntax and cannot turn it into plain text.

In this case, first use the render filter to render the Markdown text into HTML, then use striptags to remove the HTML tags. The render filter converts Markdown syntax into HTML structures that browsers can interpret. The complete pipeline looks like this:

{# Assume archive.Content stores Markdown-formatted content #}
{{ archive.Content | render | striptags }}

The render filter converts the Markdown content to HTML, and striptags then removes those HTML tags, so the final output is plain text. Note that render outputs HTML: if you display that output directly on the page, you need to add the safe filter after render, otherwise the HTML is escaped and the tags show up as literal text. When render is combined with striptags, safe is not necessary, because striptags removes all the HTML and the final result is plain text.
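
The two cases described above can be put side by side. This is a minimal sketch that uses only the filters named in this article. The surrounding <div> and <p> wrappers are placeholder markup, not something AnQiCMS requires.

```twig
{# Rich display: render Markdown to HTML and mark it safe so the tags are not escaped #}
<div>{{ archive.Content | render | safe }}</div>

{# Plain text: render first, then strip the generated tags; safe is not needed here #}
<p>{{ archive.Content | render | striptags }}</p>
```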

Flexible control: the removetags filter

Sometimes the requirement is more refined: we do not want to remove all HTML tags, only specific ones, while keeping the rest (for example, keeping <a> tags so users can still click links, but removing all <img> image tags or <p> paragraph tags). This is where the removetags filter is especially useful.

The removetags filter lets you specify one or more HTML tags to remove. Provide a comma-separated list of tag names after the filter. For example, to remove the <i> and <span> tags but keep everything else, write:

{# Remove the <i> and <span> tags and keep all other tags #}
{{ "<strong><i>Hello!</i><span>AnQiCMS</span></strong>" | removetags:"i,span" }}

This code outputs <strong>Hello!AnQiCMS</strong>. The <i> and <span> tags have been removed, while the <strong> tag is kept. This fine-grained control gives you a lot of flexibility in how content is displayed.
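
The link-preserving case mentioned earlier (keep <a>, drop <img> and <p>) could be sketched as follows. The trailing safe filter is an assumption here: it is added so that the remaining tags are output as HTML rather than escaped, in the same way this article describes for render. Check the result in your own template before relying on it.

```twig
{# Remove <img> and <p> tags, keep <a> and every other tag #}
{# safe is an assumption: it keeps the remaining tags from being escaped on output #}
{{ archive.Content | removetags:"img,p" | safe }}
```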

Combining with truncation to generate a plain text summary

When generating article abstracts or summaries, we need not only plain text but also control over its length. AnQiCMS provides the truncatechars and truncatewords filters, which truncate a string and automatically append an ellipsis (...). Used together with striptags, they make it easy to generate plain text summaries of the required length:

{# Get the plain text content and truncate it to 100 characters as the summary #}
<p>{{ archive.Content | render | striptags | truncatechars:100 }}</p>

{# Or truncate by word count #}
<p>{{ archive.Content | render | striptags | truncatewords:30 }}</p>

Note that truncatechars truncates by character count (a Chinese character counts as one character), while truncatewords truncates by word count. Choose the truncation method that fits your needs and the characteristics of your content.

Practical suggestions

Removing HTML tags in AnQiCMS templates and displaying only plain text revolves around two filters, striptags and removetags. In practice, you need to:

  1. Determine the source of the content: check whether archive.Content holds HTML or Markdown. If it is Markdown, convert it with the render filter first.
  2. Choose the appropriate filter: use striptags to remove all tags, or removetags to remove only some of them.
  3. Consider the length of the summary: when generating a summary, combine with truncatechars or truncatewords to keep the output concise.
  4. SEO optimization: use striptags inside the <meta name="description" content="..."> tag so the output is plain text, which is friendly to search engines.
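
Put together, the four steps above might look like this in a page's head section. This is a sketch only. The 150-character limit is an arbitrary choice for illustration, and the render step applies when the content is stored as Markdown.

```twig
{# Steps 1-4 combined: render Markdown, strip all tags, truncate, use as the meta description #}
<meta name="description" content="{{ archive.Content | render | striptags | truncatechars:150 }}">
```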

These filters provided by AnQiCMS allow template designers to flexibly control the way content is presented, meeting various needs from full rich text display to concise plain text output, thereby building websites with more expressive and functional features.


Frequently Asked Questions (FAQ)

1. What are the main differences between the striptags and removetags filters?

The striptags filter removes every HTML and XML tag it detects, without exception, and outputs plain text. The removetags filter gives finer control: you specify one or more HTML tags to remove (such as <img> or <p>), and all other HTML tags stay in the content. Which filter to use depends on whether you want to remove all formatting or keep some of it.

2. How to ensure the appropriate length of plain text content when generating an article summary?

After obtaining the plain text, control its length with the truncatechars or truncatewords filter. First use render (if the content is Markdown) and striptags to remove the HTML tags, then apply the truncation filter. For example, {{ archive.Content | render | striptags | truncatechars:150 }} converts the content to plain text, truncates it to 150 characters, and adds an ellipsis. truncatewords truncates by word count instead.

3. Does outputting plain text with the striptags filter affect the website's search engine optimization (SEO)?

This depends on where you use the plain text. In some cases plain text is good for SEO. For example, a site's <meta name="description"> tag should contain only plain text, because search engines usually only crawl and display plain text descriptions. Plain text article summaries on list pages also help search engines understand the content faster. However, do not convert all of the main content into plain text, because search engines also parse the HTML structure to understand the page layout and content focus. The best practice is to use plain text where a concise, unformatted display is needed, and keep rich HTML formatting in the main content area.