In website operation, URL handling is a key factor in user experience and search engine optimization (SEO). When a URL must include Chinese or other non-ASCII characters, escaping it correctly becomes particularly important. For AnQiCMS users, understanding the recommended escaping methods helps build more stable and user-friendly websites.

AnQiCMS's URL Handling Philosophy

AnQiCMS, an enterprise-level content management system developed in Go, has paid close attention to clean, SEO-friendly URLs since its initial design. By providing features such as pseudo-static (URL rewrite) configuration and custom URL aliases, the system aims to make a website's link structure clearer and easier to understand and crawl. In keeping with this philosophy, AnQiCMS offers different recommendations and mechanisms for the two parts of a URL: the path and the query parameters.

Chinese in the Path: Use ASCII Characters or Automatic Pinyin Conversion

Specifically, when you create or edit articles, products, categories, tags, or single pages in the backend and the title or name contains Chinese, AnQiCMS will usually convert it to pinyin automatically and use the result as the content's URL alias (i.e., the "custom URL" field), producing a link such as /anqicms-jiaocheng.html. This automatic conversion is a very useful feature that the system provides to keep URLs usable.

Although the system allows users to modify these custom URL aliases manually, in practice we strongly recommend using only English letters, numbers, and underscores (or hyphens). This keeps URLs concise and readable, and it matches the preference of most search engines for high-quality URLs. If Chinese must be used, make sure the system has correctly converted it to pinyin or another ASCII form.
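The character rule above can be checked mechanically. The following is a minimal Go sketch, assuming the rule is exactly "ASCII letters, digits, underscores, and hyphens"; the function name isASCIIAlias and the pattern are illustrative and are not taken from AnQiCMS itself.

```go
package main

import (
	"fmt"
	"regexp"
)

// aliasPattern accepts only ASCII letters, digits, underscores, and hyphens.
// It mirrors the recommendation in the text; it is an assumption, not
// AnQiCMS's own validation rule.
var aliasPattern = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)

// isASCIIAlias (hypothetical helper) reports whether s is a non-empty alias
// made up only of the recommended characters.
func isASCIIAlias(s string) bool {
	return aliasPattern.MatchString(s)
}

func main() {
	aliases := []string{"anqicms-jiaocheng", "about_us", "安企CMS教程", "hello world"}
	for _, alias := range aliases {
		fmt.Printf("%q valid=%v\n", alias, isASCIIAlias(alias))
	}
}
```

A check like this could be run before saving a manually edited alias, so that a value containing Chinese characters or spaces is caught early.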

Chinese in Query Parameters: Use urlencode for Standard Escaping

Unlike the URL path, Chinese or other non-ASCII characters passed in URL query parameters (the query string) must be URL-escaped. Query parameters usually follow the ? in key=value form and pass additional data to the server, such as search keywords and filter conditions.

AnQiCMS provides a set of built-in template filters for such situations, of which the most commonly used and recommended is the urlencode filter. urlencode percent-encodes a string according to the URL encoding specification, converting special characters (including Chinese characters, spaces, and so on) into percent-encoded form. For example, the query parameter q=安企CMS becomes q=%E5%AE%89%E4%BC%81CMS after passing through the urlencode filter. This encoding is a web standard and is widely supported by browsers and servers.
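To see the same encoding outside a template, here is a minimal Go sketch using the standard library's url.QueryEscape. Whether the urlencode filter calls this exact function internally is an assumption; the helper name encodeQueryValue is hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// encodeQueryValue (hypothetical helper) percent-encodes a single query
// parameter value using the Go standard library.
func encodeQueryValue(v string) string {
	return url.QueryEscape(v)
}

func main() {
	encoded := encodeQueryValue("安企CMS")
	fmt.Println("/search?q=" + encoded) // /search?q=%E5%AE%89%E4%BC%81CMS

	// Decoding restores the original text, so no data is lost in transit.
	decoded, err := url.QueryUnescape(encoded)
	if err != nil {
		panic(err)
	}
	fmt.Println(decoded) // 安企CMS
}
```

Each Chinese character becomes three percent-encoded bytes because it occupies three bytes in UTF-8, while the ASCII letters CMS pass through unchanged. Note that url.QueryEscape writes a space as +, which is the form-encoding convention for query strings.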

You can use the urlencode filter in AnQiCMS templates like this to ensure that query parameters are escaped correctly:

{# Assume a variable named searchKeyword whose value is "安企CMS" #}
<a href="/search?q={{ searchKeyword|urlencode }}">Search for AnQiCMS-related content</a>

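{# If you need to encode an entire URL (uncommon, but some APIs may require it), pass the result as a parameter value. #}
{# A fully encoded URL is no longer a usable link target on its own. The /go?url= endpoint below is hypothetical. #}
{% set rawUrl = "https://example.com/api?param=中文测试" %}
<a href="/go?url={{ rawUrl|urlencode }}">API call link</a>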
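Encoding an entire URL escapes its structural characters (:, /, ?, =) as well, so the result is normally carried as the value of another parameter rather than used as an address by itself. The Go sketch below illustrates this with url.Values; the helper wrapURL, the /go path, and the url parameter name are all hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// wrapURL (hypothetical helper) embeds target as the value of a "url"
// query parameter appended to base.
func wrapURL(base, target string) string {
	q := url.Values{}
	q.Set("url", target)
	return base + "?" + q.Encode()
}

func main() {
	link := wrapURL("/go", "https://example.com/api?param=中文测试")
	fmt.Println(link)

	// The receiving side gets the original URL back after parsing.
	parsed, err := url.Parse(link)
	if err != nil {
		panic(err)
	}
	fmt.Println(parsed.Query().Get("url"))
}
```

The first line printed is /go?url=https%3A%2F%2Fexample.com%2Fapi%3Fparam%3D%E4%B8%AD%E6%96%87%E6%B5%8B%E8%AF%95, and the second is the original, unencoded URL.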

AnQiCMS also provides another filter named iriencode, which also escapes URL parameters. iriencode is designed to keep an Internationalized Resource Identifier (IRI) more readable, so it leaves more characters that it considers safe unencoded and does not percent-encode as thoroughly as urlencode. In practice, however, urlencode is usually the safer and more recommended choice for maximum compatibility across browsers, servers, and proxies, especially when sending parameters to third-party services.
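The difference between a thorough and a lenient encoder can be pictured with a short Go sketch. It assumes, purely for illustration, that the lenient variant keeps the characters /, #, %, ( and ) untouched and encodes everything else; the function lenientEncode and its character set are hypothetical and are not the actual iriencode implementation.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// keepChars lists the characters the lenient encoder leaves untouched.
// The exact set is an assumption made for this illustration.
const keepChars = "/#%()"

// lenientEncode (hypothetical) escapes every character except those in
// keepChars, which pass through unchanged.
func lenientEncode(s string) string {
	var b strings.Builder
	for _, r := range s {
		if strings.ContainsRune(keepChars, r) {
			b.WriteRune(r)
		} else {
			b.WriteString(url.QueryEscape(string(r)))
		}
	}
	return b.String()
}

func main() {
	in := "/news/安企(CMS)#top"
	fmt.Println(url.QueryEscape(in)) // thorough: structural characters are escaped too
	fmt.Println(lenientEncode(in))   // lenient: /, (, ) and # stay readable
}
```

The thorough form yields %2Fnews%2F%E5%AE%89%E4%BC%81%28CMS%29%23top, while the lenient form yields /news/%E5%AE%89%E4%BC%81(CMS)#top. The lenient output is easier to read, but a # or / left inside a parameter value can change how the URL is parsed, which is why the thorough form is the safer choice for parameter values.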

Why is URL escaping so important?

The fundamental reason for URL escaping lies in the limits of the URL specification. The original URL specification (RFC 1738) allows only a small subset of ASCII characters. Non-ASCII characters, or characters with special meaning such as &, / and ?, appearing directly in a URL may cause the following problems:

  1. Parsing error: The browser or server may not be able to identify and parse the URL correctly.
  2. Data loss or damage: Inconsistent character encoding may cause the transmitted data to become garbled or lost.
  3. Security issue: Special characters that are not escaped may be maliciously exploited to trigger cross-site scripting (XSS) and other security vulnerabilities.

By performing standard escaping with urlencode, we can ensure that every part of the URL conforms to the specification, avoiding the issues above and keeping the website stable and secure.
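The first two problems are easy to reproduce. In the Go sketch below, a keyword containing & is placed in a query string once without escaping and once with it; the helper names buildQuery and keywordFromQuery are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildQuery (hypothetical helper) builds a query string with an escaped
// keyword as the value of "q".
func buildQuery(keyword string) string {
	return "q=" + url.QueryEscape(keyword)
}

// keywordFromQuery (hypothetical helper) parses a raw query string and
// returns the value of "q", or an empty string if parsing fails.
func keywordFromQuery(rawQuery string) string {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return ""
	}
	return values.Get("q")
}

func main() {
	keyword := "安企&CMS"

	// Unescaped: the & is read as a parameter separator, so the keyword is cut short.
	fmt.Println(keywordFromQuery("q=" + keyword)) // 安企

	// Escaped: the full keyword survives the round trip.
	fmt.Println(keywordFromQuery(buildQuery(keyword))) // 安企&CMS
}
```

In the unescaped case the server silently receives a different keyword plus a stray parameter named CMS, which is exactly the kind of parsing error and data loss described in the list above.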

Summary

When managing website content with AnQiCMS, URLs that contain Chinese or other non-ASCII characters should be handled according to these practices:

  • For content in the URL path (such as aliases): Prefer the pinyin URL that the system generates automatically, or manually enter pure ASCII characters (English letters, numbers, underscores or hyphens). This helps improve SEO performance and compatibility.
  • For content in the URL query parameters: Use the urlencode filter to apply standard percent-encoding to parameter values, ensuring correct data transmission and broad compatibility.

Follow these principles and your AnQiCMS website will have a more robust, more user-friendly URL structure.


Frequently Asked Questions (FAQ)

Q1: Why does AnQiCMS not directly allow Chinese characters in URL paths?

A1: AnQiCMS is designed this way mainly for URL compatibility and SEO. Chinese characters used directly in the URL path are recognized by some browsers and systems, but they may cause encoding issues in other environments (such as some older browsers, search engine crawlers, or third-party tools), leading to garbled text or inaccessible content. In addition, URLs made of pure ASCII characters are typically shorter and clearer, and are considered an SEO best practice that helps a website perform better in search engines. The system generates pinyin automatically, or suggests using ASCII characters, to help users avoid these potential issues.

Q2: What are the differences between urlencode and iriencode, and which one should I choose?

A2: urlencode performs the most widely used, standard URL percent-encoding, converting all non-ASCII characters and special characters (such as spaces, &, /, ? and so on) into percent-encoded form. iriencode retains more characters (such as /, #, %, ( and )) instead of encoding them as urlencode does. iriencode is more suitable for those who need to process URLs