When running a website, you will often find product descriptions that contain a large number of HTML tags. These tags may come from different content sources or may be introduced accidentally during editing, and they make the description display poorly in some scenarios, such as plain-text email notifications, SEO summaries, or compact views on mobile devices. AnQi CMS offers two ways to solve this problem: strip the tags dynamically in the front-end template, or batch-process the content in the backend to remove the redundant HTML tags permanently.
Method one: Dynamically remove HTML tags in the template, retaining plain text only
If you want to keep the original product description unchanged in the database and remove the HTML tags only when it is displayed on the front-end page, an AnQi CMS template filter can do this.
AnQi CMS uses Django template engine syntax and provides a striptags filter that removes all HTML tags from a piece of HTML content, leaving only plain text.
Application scenario:
- Display a product description summary on the product list page without complex formatting.
- In a specific area of the website, it is necessary to output plain text without any style.
- The original data in the database must stay unmodified, so the HTML formatting remains available for future use.
Operation steps:
Find the related template file: Product descriptions are usually output through the archiveDetail or archiveList tag, for example {{item.Description}} or {{item.Content}}. You need to locate the template file that displays the product description (usually detail.html or list.html); the specific path may vary depending on the template you are using.
Use the striptags filter: Append |striptags to the product description variable wherever you want plain text. For example, if your product description is stored in item.Description, you can modify it like this:
    <p>{{ item.Description|striptags }}</p>
If your product description is long, it may be stored in item.Content and contain more complex HTML structures. The same approach applies:
    <div>
        {% archiveDetail productContent with name="Content" %}
        {# Remove all HTML tags and keep only the plain text #}
        {{ productContent|striptags }}
        {% endarchiveDetail %}
    </div>
Note that the striptags filter removes all HTML tags, including image (<img>) and video (<video>) tags. If you do not want images and videos to display, this is exactly what you need. If you need to retain some tags, for example to remove only scripts, consider using the removetags filter and specifying the tags to remove.
Save and view the effect: Save the modified template file and refresh the front-end page. All HTML tags in the product description should be gone, leaving only concise plain text.
The advantage of this method is that it is non-destructive: it does not modify the original data in the database, and the template engine processes the content dynamically each time it renders.
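To make the non-destructive behaviour concrete, here is a minimal Python sketch of what a tag-stripping filter does conceptually. It is not AnQi CMS's implementation; the function name strip_tags and the sample description are assumptions made for illustration. The stored string is never changed, only the rendered copy.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are ignored.
        self.parts.append(data)


def strip_tags(html: str) -> str:
    """Return the text content of `html` with every tag dropped."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Hypothetical stored description; the value in the "database" stays as-is.
stored = '<p>Light <strong>aluminium</strong> frame<img src="a.jpg"></p>'
rendered = strip_tags(stored)

print(rendered)  # plain text used for display
print(stored)    # original HTML, still intact
```

Because the stripping happens on a copy at render time, the same stored HTML can still be shown with full formatting elsewhere on the site.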
Method two: Use the background "Site-wide Content Replacement" feature to batch permanently remove HTML tags
If you want to remove HTML tags from product descriptions in the database itself and permanently convert the content to plain text, the "Site-wide Content Replacement" feature of AnQi CMS, combined with regular expressions, can process the records in batches.
Application scenario:
- There is a large amount of historical data in the database, and the product descriptions all contain unnecessary HTML tags.
- Future content will be published strictly as plain text, and you want the existing data to follow the same standard.
- You want to reduce the size of the database or simplify the output of some API interfaces.
Operation steps:
Enter "Document Keyword Replacement": Log in to the AnQi CMS backend, find "Content Management" in the left navigation bar, and then click "Document Keyword Replacement". Although the tool is named Keyword Replacement, it supports regular expressions and can remove HTML tags in batches.
Configure the replacement rule:
- Select mode:In the replacement settings, select the "Regular Expression" mode. This is a crucial step because it allows you to use pattern matching to identify and remove HTML tags.
- Replace keyword: In the "Replace Keyword" input box, enter the regular expression used to match all HTML tags. A commonly used, relatively safe regular expression is:
  <[^>]+>
  This expression matches any character sequence that starts with <, ends with >, and has one or more characters other than > in between. This captures most HTML tags.
- Replace with: Leave the "Replace with" text box blank. Every matching HTML tag is then replaced with nothing, which removes it.
- Select field: In the "Replace field" section, check the fields that hold product descriptions. Typically, product descriptions are stored in the Content or Description field. Choose carefully to avoid operating on other fields by mistake.
- Select a model (if needed): If your website has multiple content models (such as article models and product models) and you only want to process descriptions under the product model, select "Product Model" in the filter conditions.
(Picture is for illustration purposes only, please enter regular expressions manually)
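Before entering the rule in the backend, it can help to try the pattern on a sample description locally. The sketch below uses Python's re module purely for illustration; the regex engine AnQi CMS uses is not described here, and the sample string and the remove_tags helper are assumptions. The pattern <[^>]+> uses only basic syntax, so it should behave the same way in most engines.

```python
import re

# Same pattern as the replacement rule: "<", one or more non-">" characters, ">".
TAG_PATTERN = re.compile(r"<[^>]+>")


def remove_tags(text: str) -> str:
    """Replace every match of the tag pattern with an empty string."""
    return TAG_PATTERN.sub("", text)


# Hypothetical product description as it might be stored in the Content field.
sample = '<div class="intro"><p>Waterproof to 50&nbsp;m.</p><br/></div>'

print(remove_tags(sample))
```

Note that the result still contains the entity &nbsp;. The pattern removes tags only, so HTML entities stay in the text unless they are handled by a separate rule.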
Execute batch replacement (extremely important, please operate with caution): Before clicking to execute the replacement, be sure to back up the database! This operation is irreversible. Once executed, the HTML tags in the product description will be permanently deleted from the database. After confirming that the backup is correct, click the "Execute Replacement" button. The system will traverse all product descriptions that meet the conditions and remove the HTML tags from them.
After replacement, you can randomly check several product descriptions to confirm that the HTML tags have been successfully removed.
Frequently Asked Questions (FAQ)
Q1: What are the differences between the striptags and removetags filters?
A1: The striptags filter removes all HTML tags regardless of type; anything wrapped in < > is removed. The removetags filter lets you specify which HTML tags to remove. For example, {{ item.Content|removetags:"script,style" }} removes <script> and <style> tags but retains other HTML tags. If you need to retain some formatting but remove specific functional tags, removetags is the better choice. If the goal is plain text, striptags is more efficient and direct.
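The difference can be illustrated with a small Python sketch of selective tag removal. This is an assumption-laden illustration, not the filter's actual code: the function remove_named_tags and its regex are hypothetical, and whether the real removetags filter also drops the text enclosed by a removed tag is not stated here. This sketch removes the tags only and leaves the text between them.

```python
import re


def remove_named_tags(html: str, names: str) -> str:
    """Drop the opening and closing tags listed in `names` (comma-separated)."""
    tag_names = [n.strip() for n in names.split(",") if n.strip()]
    if not tag_names:
        return html
    alternation = "|".join(re.escape(n) for n in tag_names)
    # \b keeps "b" from also matching "br" or "body".
    pattern = re.compile(rf"</?(?:{alternation})\b[^>]*>", re.IGNORECASE)
    return pattern.sub("", html)


# Hypothetical content mixing formatting tags with a script tag.
content = "<p>Hi <script>x()</script><b>there</b></p>"

print(remove_named_tags(content, "script,style"))
```

Here the <p> and <b> tags survive while the <script> tags are removed, which mirrors the "retain some formatting" use case. The script body x() remains as text in this sketch, so it is not a substitute for proper sanitization.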
Q2: Before executing the backend batch replacement, what else should be noted besides backing up the database?
A2: Batch replacement is a high-risk operation. In addition to backing up the database, pay attention to the following points:
- Check the regular expression carefully: Ensure that the regular expression you enter matches HTML tags accurately without affecting other content. <[^>]+> is a commonly used expression, but it may have issues with unconventional or incomplete HTML.
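The kinds of input that trip up <[^>]+> can be checked locally before running the replacement. The following Python sketch uses three hypothetical inputs (not taken from a real site) and Python's re module as a stand-in for whatever engine the backend uses.

```python
import re

TAG_PATTERN = re.compile(r"<[^>]+>")

# Hypothetical inputs chosen to stress the pattern.
cases = {
    "gt_in_attribute": '<a title="a > b">link</a>',  # ">" inside an attribute value
    "comparison_text": "Price < 100 and weight > 2",  # "<" and ">" used as plain text
    "unclosed_tag": "Good <b value",                  # tag that never closes
}

# Apply the same replacement the backend rule would perform.
results = {name: TAG_PATTERN.sub("", text) for name, text in cases.items()}

for name, cleaned in results.items():
    print(f"{name}: {cleaned!r}")
```

In the first case the match stops at the > inside the attribute, so part of the attribute leaks into the text. In the second case ordinary text between < and > is deleted as if it were a tag. In the third case nothing is removed, because the pattern requires a closing >. If your descriptions contain content like this, review those records by hand, or test the rule on a copy of the data first.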