In website operation, constructing and processing URLs (Uniform Resource Locators) is a fundamental and critical task. Especially when you generate links dynamically or use user input as parameters, encoding the URL correctly keeps the link valid and prevents garbled characters and potential security issues. The urlencode and iriencode filters help us manage special characters in URLs. Although both are used for encoding, their application scenarios and processing methods differ.

The urlencode filter: strict percent-encoding

First, let's look at the urlencode filter. Its main function is to apply standard URL percent-encoding to a variable. Every character other than letters, digits, and - . _ ~ is converted to %xx, where xx is the hexadecimal value of a byte. A non-ASCII character is encoded as its UTF-8 bytes, so a single Chinese character becomes several %xx groups.

This encoding method is strict and comprehensive. Its goal is to ensure that every character in the URL can be transmitted and parsed safely, without ambiguity. For example, spaces, Chinese characters, the & symbol (the parameter delimiter), and the = symbol (the key-value separator) cannot be included directly in a URL value. If these characters appear unencoded, the link may break, parameters may be parsed incorrectly, and security vulnerabilities may even be triggered.
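The rule described above can be sketched in a few lines of Python. This is only an illustration of percent-encoding in general, not the Anqi CMS implementation; the helper name percent_encode and the UNRESERVED set are hypothetical names chosen for this example.

```python
# Characters left untouched in this sketch: ASCII letters, digits, and - . _ ~
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def percent_encode(text: str) -> str:
    """Hypothetical helper: encode every UTF-8 byte outside UNRESERVED as %XX."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)               # safe character, keep as-is
        else:
            out.append(f"%{byte:02X}")   # two-digit uppercase hex of the byte
    return "".join(out)


print(percent_encode("a b&c=d"))  # a%20b%26c%3Dd
print(percent_encode("安"))        # %E5%AE%89
```

The second call shows why one Chinese character produces three %xx groups: it occupies three bytes in UTF-8.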

Applicable scenarios:

  • Encode an entire URL or query string: When you need to pass a complete URL string as a parameter value of another URL (for example, for redirection or tracking), or when you need to encode a whole query string to preserve its integrity, urlencode is an ideal choice.
  • Encode a single query parameter value: The most common scenario is a user typing Chinese characters, spaces, or special symbols into a search box. To pass these keywords safely as a URL parameter, you need urlencode.
    • Example: Suppose the user searches for "安企 CMS 官网" (Anqi CMS Official Website). Placed directly into the URL, it may cause problems: http://example.com/search?q=安企 CMS 官网. After urlencode: http://example.com/search?q=%E5%AE%89%E4%BC%81%20CMS%20%E5%AE%98%E7%BD%91 (here %20 represents a space, and %E5%AE%89 and similar groups represent Chinese characters).
  • Ensure all unsafe characters are handled: When you are unsure which characters the input may contain, urlencode offers the highest level of safety, because no unexpected character can make the URL invalid.
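The scenarios above can be reproduced outside the template engine with Python's standard urllib.parse.quote. This is a sketch for comparison only: the helpers build_search_url and build_redirect_url, and the parameter names q and next, are assumptions made for this example and are not part of Anqi CMS.

```python
from urllib.parse import quote


def build_search_url(base: str, keyword: str) -> str:
    """Hypothetical helper: append a percent-encoded keyword as the q parameter."""
    return f"{base}?q={quote(keyword, safe='')}"


def build_redirect_url(base: str, target: str) -> str:
    """Hypothetical helper: pass a complete URL as the value of a next parameter."""
    return f"{base}?next={quote(target, safe='')}"


print(build_search_url("http://example.com/search", "安企 CMS 官网"))
# http://example.com/search?q=%E5%AE%89%E4%BC%81%20CMS%20%E5%AE%98%E7%BD%91

print(build_redirect_url("http://example.com/login",
                         "http://www.example.org/foo?a=b&c=d"))
# http://example.com/login?next=http%3A%2F%2Fwww.example.org%2Ffoo%3Fa%3Db%26c%3Dd
```

Only the parameter value is encoded in both helpers. The ? and = that belong to the outer URL stay intact, which is why the encoding is applied to the value and not to the finished link.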

In Anqi CMS templates, the urlencode filter is used as follows:

{{ "http://www.example.org/foo?a=b&c=d"|urlencode }}
{# Output: http%3A%2F%2Fwww.example.org%2Ffoo%3Fa%3Db%26c%3Dd #}

{{ "我的搜索关键词"|urlencode }}
{# Output: %E6%88%91%E7%9A%84%E6%90%9C%E7%B4%A2%E5%85%B3%E9%94%AE%E8%AF%8D #}

The iriencode filter: structure-preserving internationalized encoding

The iriencode filter provides a more lenient encoding, intended mainly for IRIs (Internationalized Resource Identifiers). An IRI is a superset of a URL: it allows more Unicode characters in identifiers and so supports languages worldwide. When encoding, iriencode keeps the structural special characters of the URL and processes only the characters that actually need to be encoded.

According to the Anqi CMS documentation, iriencode leaves the characters /#%[]=:;$&()+,!?*@'~ as they are and percent-encodes all other characters that require it. In other words, it respects the structure of the URL: characters that act as separators or carry a specific meaning are not encoded, so the URL stays readable and structurally intact.
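The behavior described above can be approximated in Python by passing the retained characters to urllib.parse.quote as its safe set. This is a sketch under the assumption that the character list quoted from the documentation is complete; iri_encode and IRI_SAFE are hypothetical names, and the real filter may differ in edge cases.

```python
from urllib.parse import quote

# Characters the article lists as left untouched by iriencode.
IRI_SAFE = "/#%[]=:;$&()+,!?*@'~"


def iri_encode(value: str) -> str:
    """Hypothetical approximation: percent-encode everything except IRI_SAFE,
    ASCII letters, digits, and the characters - . _ (which quote always keeps)."""
    return quote(value, safe=IRI_SAFE)


print(iri_encode("http://example.com/产品分类/电子产品"))
# http://example.com/%E4%BA%A7%E5%93%81%E5%88%86%E7%B1%BB/%E7%94%B5%E5%AD%90%E4%BA%A7%E5%93%81

print(iri_encode("?foo=123&bar=yes"))
# ?foo=123&bar=yes
```

Compared with a strict encoder, the only difference here is the larger safe set: the scheme separator, the slashes, and the query delimiters pass through unchanged while the Chinese path segments are still encoded.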

Applicable scenarios:

  • Encode URL path segments: When the URL path contains Chinese or special characters, you usually want the path delimiter / to stay unchanged so that the path structure is preserved.
    • Example: Given http://example.com/产品分类/电子产品, iriencode encodes the Chinese segments 产品分类 and 电子产品 but keeps the /: http://example.com/%E4%BA%A7%E5%93%81%E5%88%86%E7%B1%BB/%E7%94%B5%E5%AD%90%E4%BA%A7%E5%93%81
  • Internationalized domain names or paths: If your website uses Chinese or other non-ASCII characters in its domain name or paths (such as .公司 or /新闻标题), iriencode is the better fit for these internationalized elements, because it is designed to work with a broader character set.
  • Specific URL structure characters must be preserved: When building complex URLs, you may know for certain that some characters (such as :, /, =, and &) are part of the URL structure and should not be encoded. iriencode encodes the other unsafe characters without damaging these structural ones.
  • HTML entity encoding in specific environments: Although the filter is named iriencode, the example given in the documentation, "?foo=123&bar=yes"|iriencode, outputs ?foo=123&amp;bar=yes. This suggests that in some cases the output is also HTML-entity encoded (& is converted to &amp;). If your final output is embedded directly in HTML and you need & as an HTML entity instead of a percent-encoded value, this behavior may be useful. In typical URL encoding, however, & is not converted to &amp;, so it is recommended to verify the actual output when using the filter.
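The last point can be illustrated by treating the & conversion as a separate HTML-escaping step applied to the encoded value. That split is an assumption made for this sketch and is not confirmed by the Anqi CMS documentation; iri_encode_for_html and IRI_SAFE are hypothetical names.

```python
from html import escape
from urllib.parse import quote

# Characters the article lists as left untouched by iriencode.
IRI_SAFE = "/#%[]=:;$&()+,!?*@'~"


def iri_encode_for_html(value: str) -> str:
    """Hypothetical two-step sketch: IRI-style percent-encoding first,
    then HTML escaping so the result can sit inside an HTML attribute."""
    encoded = quote(value, safe=IRI_SAFE)   # URL layer: & is kept as-is
    return escape(encoded, quote=True)      # HTML layer: & becomes &amp;


print(iri_encode_for_html("?foo=123&bar=yes"))  # ?foo=123&amp;bar=yes
```

Seen this way, &amp; is an HTML concern, not a URL one: a browser decodes the entity back to & before it requests the link, so the URL itself is unchanged.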

in Anqi CMS