URL Escaping in AnQiCMS Multi-Site Mode

Understanding the URL construction mechanism of AnQiCMS

In AnQiCMS, the website URL structure is highly flexible. With the "pseudo-static rules" (SEO-friendly URL rules) feature, we can customize friendly URL formats for articles, products, categories, single pages, and other content. A rule can follow an ID-based pattern (such as /article-123.html) or an alias-based naming pattern (token/filename/catname), such as /news/latest-updates.html. These rules are configured in the backend so that URLs are both clean and in line with SEO best practices.

In multi-site mode in particular, each site can have its own pseudo-static rules and content aliases. When a new site is created, it is given a "site root directory" and a "site address", so each site has its own content organization and URL namespace. When a template retrieves a document link through the {{item.Link}} tag, the system usually generates a ready-to-use URL, with special characters already handled, based on the current pseudo-static rules and content settings. This convenience greatly lowers the barrier to everyday content publishing.

The necessity of URL parameter escaping

Although AnQiCMS handles internally generated links well, a few scenarios still require us to escape URL parameters by hand. A URL locates a network resource, and it has strict rules about which characters may appear where. Characters such as & (separates parameters), = (assigns a value), ? (starts the query string), and / (separates path segments), as well as spaces and non-ASCII characters such as Chinese, can cause the following problems if they appear unescaped in a URL path or parameter value:

  1. URL parsing errors: The browser or server may fail to identify the structure of the URL correctly, so the page cannot be reached or the wrong resource is loaded.
  2. Data loss or corruption: Special characters in a parameter value are parsed incorrectly, so the data passed to the backend is incomplete or altered. For example, if the search keyword 'product & service' is not escaped, the & symbol may be mistaken for the delimiter of the next parameter.
  3. Security risks: User input that is concatenated into a URL without escaping and then processed by the backend may open the door to XSS (Cross-Site Scripting) or other injection attacks. Although AnQiCMS includes many internal security protections, understanding and preventing the problem yourself is always safer.
  4. SEO and user experience: A messy or incorrect URL hinders search engines from crawling and understanding page content, confuses users, and undermines trust in the website.

In a multi-site environment, if we need to pass complex query parameters between different sites, such as jumping from a main site's product list to a sub-site's filtered result page, manually constructing the URL and correctly escaping the parameters is particularly critical.
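
As a rough illustration of such a jump, the Python sketch below assembles a link from a main site to a filtered list on a sub-site. The host name, path, and parameter names are hypothetical and chosen only for the example; the point is that every parameter value is escaped before it is joined into the URL.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_cross_site_url(base_url: str, path: str, params: dict) -> str:
    """Join a site address, a path, and escaped query parameters."""
    # quote_via=quote encodes spaces as %20 instead of "+".
    query = urlencode(params, quote_via=quote)
    return f"{base_url.rstrip('/')}{path}?{query}"


url = build_cross_site_url(
    "https://shop.example.com",   # hypothetical sub-site address
    "/products/list.html",        # hypothetical filtered result page
    {"brand": "A&B 工具", "ref": "主站 产品列表"},
)

# Decoding the query string returns the original values unchanged.
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded)
```

Because the & inside the brand value is encoded, the receiving site still sees exactly two parameters.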

URL parameter escaping practices in AnQiCMS

AnQiCMS provides convenient template filters for escaping URL parameters, chiefly urlencode and iriencode.

  1. urlencode filter: This is the most commonly used tool for escaping URL parameters. It converts almost every non-alphanumeric character into its %xx hexadecimal form. This keeps the URL from being misinterpreted in transit, which makes it the safe default for query parameter values.

    Use case: when you construct a URL by hand, especially when adding query parameters that contain special characters (such as spaces, Chinese characters, or &).

    Example: suppose you want to create a search link whose keyword is 安企CMS 解决方案:

    {% set keyword = "安企CMS 解决方案" %}
    <a href="/search?q={{ keyword|urlencode }}">Search {{ keyword }}</a>
    

    Here, 安企CMS 解决方案 is escaped as %E5%AE%89%E4%BC%81CMS%20%E8%A7%A3%E5%86%B3%E6%96%B9%E6%A1%88, so the resulting URL is valid.

  2. iriencode filter: Compared with urlencode, the iriencode filter encodes a smaller range of characters. It is intended mainly for Internationalized Resource Identifiers (IRI): it keeps some characters that help human readability (such as /, :, ( and ), unless they are part of a parameter value), but still escapes spaces and non-ASCII characters. In some cases it produces a more readable URL than urlencode, but it is slightly less safe than urlencode because it lets more characters pass through unchanged.

    Use case: when the appearance of the URL matters and you are sure you want to keep certain characters unencoded (such as :), or when you mainly deal with URL paths that contain non-ASCII characters but have a relatively fixed structure.

    Example: suppose you have a custom URL path that may contain Chinese, but you want the path separator / to remain unchanged:

    {% set path_segment = "我的文章分类" %}
    <a href="/category/{{ path_segment|iriencode }}/page-1.html">Open category</a>
    

    Here, 我的文章分类 is escaped, while / remains unchanged.
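
The difference between the two filters can be approximated in Python. This is only a sketch: the set of characters left untouched by iri_encode below mirrors the one used by Django's iriencode filter, and assuming that the AnQiCMS template engine behaves the same way is an assumption, not something confirmed here. The function names url_encode and iri_encode are made up for the illustration.

```python
from urllib.parse import quote

# Characters the IRI-style encoder leaves as-is.
# Assumption: same set as Django's iriencode filter.
IRI_SAFE = "/#%[]=:;$&()+,!?*@'~"


def url_encode(value: str) -> str:
    """Strict encoding, suitable for a single query parameter value."""
    return quote(value, safe="")


def iri_encode(value: str) -> str:
    """Lenient encoding that keeps URL structure characters readable."""
    return quote(value, safe=IRI_SAFE)


keyword = "安企CMS 解决方案"
path = "/category/我的文章分类/page-1.html"

print(url_encode(keyword))
# %E5%AE%89%E4%BC%81CMS%20%E8%A7%A3%E5%86%B3%E6%96%B9%E6%A1%88
print(url_encode(path))   # every "/" becomes %2F
print(iri_encode(path))   # "/" is kept, only the Chinese segment is encoded
```

Note that some encoders write a space inside a query string as + rather than %20; both forms decode to a space when the query string is parsed.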

Special considerations under multi-site mode:

  • Cross-site links: When a template on site A builds a URL that points to site B and the URL contains dynamic parameters, be sure to escape the parameter values with urlencode. For example, site A has a recommended-articles module that links to the corresponding article on site B and passes a ref parameter to record the source:

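    {% system siteBUrl with name="SiteBBaseUrl" %} {# Assumes the base URL of site B has been configured in the backend #}
    {# currentSiteName is assumed to have been defined earlier in the template #}
    {% set articleId = "100" %}
    <a href="{{ siteBUrl }}/article-{{ articleId }}.html?ref={{ currentSiteName|urlencode }}">View related article</a>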
    

    Here, currentSiteName (the current site name, which may contain Chinese characters or spaces) needs to be escaped.

  • Variables in custom rewrite rules: When you set a custom rewrite rule in the backend, for example archive===/{module}-{filename}.html, the characters in {filename} are usually handled automatically. But if you piece these variables together by hand to build a URL in a template, and their content comes from user input or may contain special characters, it is wise to escape them before use. For example, if you read {{ item.CustomFileName }} and use it as the {filename} part of the URL, you should consider {{ item.CustomFileName|urlencode }}.

  • Data collection and import: AnQiCMS supports content collection and batch import. When handling external URLs or extracting URLs from imported content, also check that they are encoded correctly. A URL that is itself badly encoded may make the page unreachable or cause a redirect to fail.
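
For the import case, one possible way to check and repair URLs before they are stored is sketched below in Python. It is not part of AnQiCMS; the function names and the sample URL are hypothetical. The idea is to percent-encode only the characters that are not allowed in a URL, while leaving the existing structure and any existing %XX escapes alone.

```python
from urllib.parse import quote, unquote

# Reserved characters plus "%" are kept, so URL structure and
# already-present %XX escapes are not encoded a second time.
URL_SAFE = "/:?#[]@!$&'()*+,;=%"


def looks_encoded(url: str) -> bool:
    """Cheap check: no spaces and no raw non-ASCII characters."""
    return url.isascii() and " " not in url


def normalize_imported_url(url: str) -> str:
    """Percent-encode spaces and non-ASCII characters in an imported URL."""
    return quote(url.strip(), safe=URL_SAFE)


raw = "https://example.com/新闻/latest news.html?tag=CMS"  # hypothetical imported URL
fixed = normalize_imported_url(raw)

print(looks_encoded(raw))    # False
print(looks_encoded(fixed))  # True
print(fixed)
```

Running the function a second time leaves the result unchanged. A stray % that is not followed by two hexadecimal digits is left as it is, so such URLs still need a manual look.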