In website operation, how URLs are handled is a key factor in both user experience and search engine optimization (SEO). Proper escaping is particularly important when a URL needs to include Chinese or other non-ASCII characters. For AnQiCMS users, understanding the escaping methods the system recommends helps build more stable and user-friendly websites.
The URL processing philosophy of AnQiCMS
AnQiCMS is an enterprise-level content management system written in Go, and its design has emphasized tidy, SEO-friendly URLs from the start. By providing pseudo-static (URL rewrite) configuration, custom URL aliases, and similar features, the system aims to make a website's link structure clearer and easier to understand and crawl. Under this philosophy, AnQiCMS offers different recommendations and mechanisms for the two parts of a URL: the path and the query parameters.
Chinese characters in the path: it is recommended to use ASCII characters or automatic pinyin conversion
In the URL path, for example the URL alias of an article detail page, the URL alias of a category page, or a custom single-page address, AnQiCMS prefers pure ASCII characters. This maximizes compatibility, avoids problems caused by inconsistent encoding handling across systems and browsers, and also benefits search engine crawling and ranking.
In particular, when you create or edit articles, products, categories, tags, or single pages in the back end and the title or name contains Chinese, AnQiCMS will usually convert it to pinyin automatically and use the result as the content's URL alias (the "custom URL" field in the document). For example, an article titled "安企CMS教程" (AnQiCMS tutorial) may automatically get a URL alias such as /anqicms-jiaocheng.html. This automatic conversion is a practical feature the system provides to keep URLs usable.
Although the system lets you edit these custom URL aliases by hand, in practice we strongly recommend sticking to letters, numbers, and underscores (or hyphens). This keeps URLs concise and readable, and matches the kind of URL most search engines prefer. If the source text must be Chinese, make sure the system has correctly converted it to pinyin or another ASCII form.
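The alias convention above can be checked mechanically. The following is a minimal Go sketch, not AnQiCMS's own validation: `isSafeAlias` and the exact pattern are hypothetical and only illustrate the "letters, numbers, underscores, or hyphens" rule.

```go
package main

import (
	"fmt"
	"regexp"
)

// safeAliasPattern matches aliases made only of ASCII letters, digits,
// underscores, and hyphens. The exact rule is an assumption for illustration.
var safeAliasPattern = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)

// isSafeAlias is a hypothetical helper: it reports whether alias follows the
// letters/numbers/underscores/hyphens convention recommended in the text.
func isSafeAlias(alias string) bool {
	return safeAliasPattern.MatchString(alias)
}

func main() {
	for _, alias := range []string{"anqicms-jiaocheng", "安企CMS教程", "my page"} {
		fmt.Printf("%q safe=%v\n", alias, isSafeAlias(alias))
	}
}
```

A check like this could be run over manually edited aliases before publishing, so that an alias containing Chinese characters or spaces is caught early.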
Chinese characters in query parameters: use urlencode for standard escaping
Query parameters are handled differently from the URL path: when we need to pass Chinese or other non-ASCII characters in the query string, they must be properly URL-escaped. Query parameters follow the ? in a URL and take the form key=value; they carry additional data to the server, such as search keywords or filter conditions.
AnQiCMS provides a set of built-in template filters for this situation, of which the most commonly used and recommended is the urlencode filter. urlencode encodes a string according to the URL encoding specification, converting special characters (including Chinese characters, spaces, and so on) into percent-encoded form. For example, the query parameter q=安企CMS becomes q=%E5%AE%89%E4%BC%81CMS after passing through the urlencode filter. This encoding is a web standard and is widely supported by browsers and servers.
You can use the urlencode filter in an AnQiCMS template like this to make sure query parameters are escaped correctly:
{# Assume a variable named searchKeyword whose value is "安企CMS" #}
<a href="/search?q={{ searchKeyword|urlencode }}">Search for content related to 安企CMS</a>
{# If you need to encode an entire URL (uncommon, but some APIs may require it) #}
{% set rawUrl = "https://example.com/api?param=中文测试" %}
<a href="{{ rawUrl|urlencode }}">API call link</a>
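To see the same percent-encoding outside a template, here is a small Go sketch using the standard library's `net/url`. It is only an illustration of the encoding itself; `buildSearchLink` is a hypothetical helper, and nothing here is taken from AnQiCMS's source.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildSearchLink is a hypothetical helper that mirrors the template example:
// it percent-encodes the keyword before appending it to the query string.
func buildSearchLink(keyword string) string {
	return "/search?q=" + url.QueryEscape(keyword)
}

func main() {
	fmt.Println(buildSearchLink("安企CMS")) // /search?q=%E5%AE%89%E4%BC%81CMS
}
```

Note that `url.QueryEscape` encodes a space as `+`, which is valid inside a query string; whether the template filter does the same is not stated in this article.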
AnQiCMS also provides another filter named iriencode, which likewise escapes URL parameters. iriencode is designed to keep an Internationalized Resource Identifier (IRI) more readable, so it leaves more characters that are considered "safe" unescaped and does not percent-encode as comprehensively as urlencode. In real web applications, however, urlencode is usually the safer and more recommended choice for maximum compatibility across browsers, servers, and proxies, especially when you need to send parameters to third-party services.
Why is URL escaping so important?
The fundamental reason for URL escaping lies in the limits of the URL specification. The original URL specification (RFC 1738) allows only a small subset of ASCII characters. Non-ASCII characters (such as Chinese) or special characters (such as spaces, &, /, and ?) can cause the following problems when they appear unescaped in a URL:
- Parsing errors: The browser or server may not be able to identify and parse the URL correctly.
- Data loss or corruption: Inconsistent character encoding may cause the transmitted data to become garbled or lost.
- Security issues: Unescaped special characters may be exploited maliciously, leading to cross-site scripting (XSS) and other security vulnerabilities.
Standard escaping with urlencode ensures that every part of the URL conforms to the specification, which avoids the problems above and keeps the website stable and secure.
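The data-loss case can be reproduced in a few lines of Go. This is a sketch using the standard `net/url` parser as a stand-in for a server; `parseKeyword` is a hypothetical helper, and the keyword containing an ampersand is an invented example.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseKeyword returns the value of "q" as a server would see it after
// parsing the raw query string.
func parseKeyword(rawQuery string) string {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return ""
	}
	return values.Get("q")
}

func main() {
	keyword := "安企&CMS"

	// Unescaped: the "&" is read as a parameter separator, truncating the value.
	fmt.Println(parseKeyword("q=" + keyword))

	// Escaped: the full keyword survives the round trip.
	fmt.Println(parseKeyword("q=" + url.QueryEscape(keyword)))
}
```

In the unescaped case the parser treats everything after the `&` as a separate parameter, so the server never receives the complete keyword.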
Summary
When operating website content with AnQiCMS, follow these practices for URLs that contain Chinese or other non-ASCII characters:
- For content in the URL path (such as aliases): Prefer the pinyin URL generated by the system, or manually enter pure ASCII characters (letters, numbers, underscores or hyphens). This helps SEO performance and compatibility.
- For content in the URL query parameters: Use the urlencode filter to apply standard percent-encoding to parameter values, ensuring correct data transmission and wide compatibility.
Follow these principles, and your AnQiCMS website will have a more robust and user-friendly URL structure.
Frequently Asked Questions (FAQ)
Q1: Why does AnQiCMS not recommend using Chinese characters directly in URL paths?
A1: This design is mainly for URL compatibility and SEO. Chinese characters used directly in a URL path are recognized by some browsers and systems, but they can run into encoding issues in other environments (such as some old browsers, search engine crawlers, or third-party tools), resulting in garbled characters or inaccessible pages. In addition, pure ASCII URLs are usually shorter and clearer, and are considered an SEO best practice that helps a website perform better in search engines. The system generates pinyin automatically, or suggests ASCII characters, precisely to help users avoid these potential problems.
Q2: What is the difference between urlencode and iriencode, and which one should I choose?
A2: urlencode is the most widely used, standard form of URL percent-encoding; it converts all non-ASCII characters and special characters (such as spaces, &, /, ?, and so on) into percent-encoded form. iriencode leaves more special characters unescaped (such as /, #, %, and parentheses), which makes the encoded URL more readable. In most web application scenarios, to ensure maximum compatibility and avoid potential problems, we recommend urlencode. iriencode is more suitable for those who can