When using AnQiCMS for content operations, we often need to include external links or email addresses in articles. To make sure these links are clickable while keeping the page clean and secure, AnQiCMS ships with many practical template filters, and the urlize filter is a handy helper for exactly this need.
Many users of urlize wonder whether a URL that contains special characters, such as & (ampersand) or = (equals sign), which are commonly used to separate parameters, will be encoded correctly so that the link does not break and the page does not render incorrectly. This article looks in detail at how the urlize filter behaves in this respect.
The basic function of the urlize filter
First, the core task of the urlize filter is clear: it scans a piece of text, automatically identifies strings that match the format of a URL or email address, and converts them into clickable HTML <a> tags. This is very convenient for user comments, messages, or any scenario involving links in plain text. For example, if you write "Welcome to our official website: https://en.anqicms.com" in an article, then after urlize processes it, the URL becomes <a href="https://en.anqicms.com" rel="nofollow">https://en.anqicms.com</a>. By default, AnQiCMS also adds the rel="nofollow" attribute to these automatically generated links, which is a sensible SEO-friendly default.
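The scan-and-wrap behavior described above can be sketched in a few lines of Python. This is only an illustration, not AnQiCMS's implementation: the function name simple_urlize and the URL pattern are hypothetical, the pattern is deliberately simplified (it ignores email addresses and would swallow trailing punctuation), and it only HTML-escapes the values rather than percent-encoding them.

```python
import html
import re

# Hypothetical, simplified pattern: http(s):// URLs and bare "www." hosts only.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[^\s<>]+")


def simple_urlize(text):
    """Wrap each URL-like substring of `text` in an <a> tag with rel="nofollow"."""

    def to_anchor(match):
        shown = match.group(0)
        # Bare "www." links get an http:// scheme, as in the article's examples.
        if shown.startswith(("http://", "https://")):
            target = shown
        else:
            target = "http://" + shown
        # HTML-escape both values so they cannot break out of the markup.
        return '<a href="{}" rel="nofollow">{}</a>'.format(
            html.escape(target, quote=True), html.escape(shown)
        )

    return URL_PATTERN.sub(to_anchor, text)


print(simple_urlize("Welcome to our official website: https://en.anqicms.com"))
```

Text without any URL-like substring passes through unchanged, which is why a filter of this kind is safe to apply to whole comment or message bodies.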
Encoding of special characters: can & and = be encoded correctly?
Let's return to the special characters we care about: & and =. These characters have special meanings in HTML: & starts an HTML entity (such as &amp;), and = separates the name and value of an HTML attribute. If a URL in plain text containing & or = is placed into the href attribute of an <a> tag without proper encoding, the link may break or be parsed incorrectly by the browser.
So how does the urlize filter handle this?
Based on actual testing and a review of the documentation, AnQiCMS's urlize filter is quite reliable here. It ensures that special characters in the href attribute of the generated <a> tag are correctly URL-encoded. This means that = remains unchanged (it is legal in URL query parameters, though it is encoded where needed, such as when it is part of a parameter value), while characters such as " (double quotation mark) are converted to %22.
For example, suppose we have a plain-text URL that contains special characters: www.anqicms.com/search?q=安企CMS&category="CMS".
When you use {{ "www.anqicms.com/search?q=安企CMS&category=\"CMS\""|urlize|safe }}, the generated link, including its href attribute, looks like this:
<a href="http://www.anqicms.com/search?q=%E5%AE%89%E4%BC%81CMS&category=%22CMS%22" rel="nofollow">www.anqicms.com/search?q=安企CMS&category="CMS"</a>
As you can see, the 安企CMS in the URL is encoded as %E5%AE%89%E4%BC%81CMS (only the Chinese characters change; the ASCII letters stay as they are), and the " double quotes are encoded as %22. In other words, even if the URL contains complex query parameters and special symbols, the urlize filter converts them into URL-compliant encoding, keeping the link valid.
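The same percent-encoding can be reproduced with Python's standard library, which is a convenient way to check what an encoded href should look like. This is a sketch for verification only, not how AnQiCMS does it internally; in particular, the set of characters left unencoded (the safe argument) is an assumption chosen to match the output shown above.

```python
from urllib.parse import quote

raw = 'www.anqicms.com/search?q=安企CMS&category="CMS"'

# Leave the URL delimiters / : ? & = untouched (an assumed choice);
# everything else outside the unreserved set is percent-encoded as UTF-8.
encoded = "http://" + quote(raw, safe="/:?&=")

print(encoded)
# → http://www.anqicms.com/search?q=%E5%AE%89%E4%BC%81CMS&category=%22CMS%22
```

Each Chinese character becomes three %XX groups because it occupies three bytes in UTF-8, while the double quote is a single byte and becomes %22.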
The subtleties of the urlize parameter: true vs. false
The urlize filter also accepts an optional parameter that controls how the displayed link text is escaped. You can append true or false after urlize.
urlize:true: When the parameter is set to true, not only is the href attribute URL-encoded, but HTML special characters in the link's display text (such as &, <, and >) are also encoded as HTML entities. For example:
{% filter urlize:true|safe %} My URL is www.anqicms.com/test="test"&key=value {% endfilter %}
In the output:
href: http://www.anqicms.com/test=%22test%22&key=value
Display text: www.anqicms.com/test=&quot;test&quot;&amp;key=value (note that " became &quot; and & became &amp;)
urlize:false: This is the default behavior, and also what you get when you explicitly pass false. The href attribute is still URL-encoded, but the link's display text keeps its original special characters (they are not encoded as HTML entities). For example:
{% filter urlize:false|safe %} My URL is www.anqicms.com/test="test"&key=value {% endfilter %}
In the output:
href: http://www.anqicms.com/test=%22test%22&key=value
Display text: www.anqicms.com/test="test"&key=value (note that " and & stay the same)
This subtle difference mainly affects how the link appears on the page. Usually, to prevent unexpected rendering problems, urlize:true is the safer choice, because it ensures that the displayed text is also encoded as HTML entities. If you want the link's displayed text to keep its original " or & characters and you are fully confident that the content is safe, urlize:false may better suit your design in certain specific scenarios.
Whether you choose true or false, the core point does not change: the urlize filter takes care of properly encoding URL special characters in the href attribute so that the link remains correct.
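The effect of the true/false switch on the display text can be mimicked with Python's html.escape. The helper display_text below is hypothetical and only models the behavior described in this section; it is not the filter's actual code.

```python
import html


def display_text(url_text, escape_entities):
    """Return the link text, HTML-entity-encoded only when escape_entities is True."""
    if escape_entities:
        return html.escape(url_text, quote=True)
    return url_text


shown = 'www.anqicms.com/test="test"&key=value'

print(display_text(shown, True))
# → www.anqicms.com/test=&quot;test&quot;&amp;key=value
print(display_text(shown, False))
# → www.anqicms.com/test="test"&key=value
```

In a browser both versions look the same to the reader, since &quot; and &amp; are rendered back as " and &; the difference lies in the HTML source, where the escaped form cannot be mistaken for markup.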
Summary
AnQiCMS's urlize filter is a powerful and intelligent tool that converts URLs and email addresses in plain text into fully functional HTML links. For special characters commonly found in URLs, such as &, =, quotation marks, and other characters that may cause problems, urlize encodes the href attribute correctly when generating links, ensuring that the link works and the page stays stable. In addition, with the true and false parameters you can flexibly control HTML entity encoding of the link's displayed text to suit different display needs.
When using the urlize filter, don't forget to add |safe: urlize generates HTML code, and |safe tells the template engine that this HTML is safe and does not need further escaping.
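To see why |safe matters, it helps to look at what an auto-escaping template engine would do to the generated markup if it were not marked as safe. The Python snippet below imitates that escaping step with html.escape; it assumes the engine escapes output by default, as the paragraph above implies, and is not taken from AnQiCMS itself.

```python
import html

# Markup of the kind urlize produces for a plain URL.
link = '<a href="https://en.anqicms.com" rel="nofollow">https://en.anqicms.com</a>'

# Without |safe, an auto-escaping engine would treat the markup as plain text.
auto_escaped = html.escape(link)

print(auto_escaped)
# → &lt;a href=&quot;https://en.anqicms.com&quot; rel=&quot;nofollow&quot;&gt;https://en.anqicms.com&lt;/a&gt;
```

The visitor would then see the literal tag text on the page instead of a clickable link.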
Frequently Asked Questions (FAQ)
What are the differences between the urlize and urlencode filters?
The urlize filter is mainly used to automatically identify URLs and email addresses in a piece of text and convert them into clickable HTML <a> tags. Its job is to find and wrap links, while also handling the URL encoding of the links themselves. The urlencode filter is a lower-level tool that performs URL percent-encoding on a single string, usually when constructing query parameters or fragments of a URL path, to ensure that these strings