QUrl Class
The QUrl class provides a convenient interface for working with URLs.
| Header: | #include <QUrl> |
| qmake: | QT += core |
Note: All functions in this class are reentrant.
Public Types
| enum | ComponentFormattingOption { PrettyDecoded, EncodeSpaces, EncodeUnicode, EncodeDelimiters, EncodeReserved, …, FullyDecoded } |
| enum | ParsingMode { TolerantMode, StrictMode, DecodedMode } |
| enum | UrlFormattingOption { None, RemoveScheme, RemovePassword, RemoveUserInfo, RemovePort, …, NormalizePathSegments } |
| enum | UserInputResolutionOption { DefaultResolution, AssumeLocalFile } |
Macros
Detailed Description
It can parse and construct URLs in both encoded and unencoded form. QUrl also has support for internationalized domain names (IDNs).
The most common way to use QUrl is to initialize it via the constructor by passing a QString. Otherwise, setUrl() can also be used.
URLs can be represented in two forms: encoded or unencoded. The unencoded representation is suitable for showing to users, but the encoded representation is typically what you would send to a web server. For example, the unencoded URL "http://bühler.example.com/List of applicants.xml" would be sent to the server as "http://xn--bhler-kva.example.com/List%20of%20applicants.xml".
A URL can also be constructed piece by piece by calling setScheme(), setUserName(), setPassword(), setHost(), setPort(), setPath(), setQuery() and setFragment(). Some convenience functions are also available: setAuthority() sets the user name, password, host and port. setUserInfo() sets the user name and password at once.
Call isValid() to check if the URL is valid. This can be done at any point while constructing a URL. If isValid() returns false, you should clear() the URL before proceeding, or start over by parsing a new URL with setUrl().
Constructing a query is particularly convenient through the use of the QUrlQuery class and its methods QUrlQuery::setQueryItems(), QUrlQuery::addQueryItem() and QUrlQuery::removeQueryItem(). Use QUrlQuery::setQueryDelimiters() to customize the delimiters used for generating the query string.
For the convenience of generating encoded URL strings or query strings, there are two static functions called fromPercentEncoding() and toPercentEncoding() which deal with percent encoding and decoding of QString objects.
fromLocalFile() constructs a QUrl by parsing a local file path. toLocalFile() converts a URL to a local file path.
The human readable representation of the URL is fetched with toString(). This representation is appropriate for displaying a URL to a user in unencoded form. The encoded form however, as returned by toEncoded(), is for internal use, passing to web servers, mail clients and so on. Both forms are technically correct and represent the same URL unambiguously -- in fact, passing either form to QUrl's constructor or to setUrl() will yield the same QUrl object.
QUrl conforms to the URI specification from RFC 3986 (Uniform Resource Identifier: Generic Syntax), and includes scheme extensions from RFC 1738 (Uniform Resource Locators). Case folding rules in QUrl conform to RFC 3491 (Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)). It is also compatible with the file URI specification from freedesktop.org, provided that the locale encodes file names using UTF-8 (required by IDN).
Relative URLs vs Relative Paths
Calling isRelative() will return whether or not the URL is relative. A relative URL has no scheme. For example:
qDebug() << QUrl("main.qml").isRelative(); // true: no scheme
qDebug() << QUrl("qml/main.qml").isRelative(); // true: no scheme
qDebug() << QUrl("file:main.qml").isRelative(); // false: has "file" scheme
qDebug() << QUrl("file:qml/main.qml").isRelative(); // false: has "file" scheme
Notice that a URL can be absolute while containing a relative path, and vice versa:
// Absolute URL, relative path
QUrl url("file:file.txt");
qDebug() << url.isRelative(); // false: has "file" scheme
qDebug() << QDir::isAbsolutePath(url.path()); // false: relative path
// Relative URL, absolute path
url = QUrl("/home/user/file.txt");
qDebug() << url.isRelative(); // true: has no scheme
qDebug() << QDir::isAbsolutePath(url.path()); // true: absolute path
A relative URL can be resolved by passing it as an argument to resolved(), which returns an absolute URL. isParentOf() is used for determining whether one URL is a parent of another.
Error checking
QUrl is capable of detecting many errors in URLs while parsing them or when components of the URL are set with individual setter methods (like setScheme(), setHost() or setPath()). If the parsing or setter function is successful, any previously recorded error conditions will be discarded.
By default, QUrl setter methods operate in QUrl::TolerantMode, which means they accept some common mistakes and misrepresentations of data. An alternate method of parsing is QUrl::StrictMode, which applies further checks. See QUrl::ParsingMode for a description of the differences between the parsing modes.
QUrl only checks for conformance with the URL specification. It does not try to verify that high-level protocol URLs are in the format they are expected to be by handlers elsewhere. For example, the following URIs are all considered valid by QUrl, even if they do not make sense when used:
- "http:/filename.html"
- "mailto://example.com"
When the parser encounters an error, it signals the event by making isValid() return false and toString() / toEncoded() return an empty string. If it is necessary to show the user the reason why the URL failed to parse, the error condition can be obtained from QUrl by calling errorString(). Note that this message is highly technical and may not make sense to end-users.
QUrl is capable of recording only one error condition. If more than one error is found, it is undefined which error is reported.
Character Conversions
Follow these rules to avoid erroneous character conversion when dealing with URLs and strings:
- When creating a QString to contain a URL from a QByteArray or a char*, always use QString::fromUtf8().
Member Type Documentation
enum QUrl::ComponentFormattingOption
The component formatting options define how the components of a URL will be formatted when written out as text. They can be combined with the options from QUrl::FormattingOptions when used in toString() and toEncoded().
| Constant | Value | Description |
|---|---|---|
| QUrl::PrettyDecoded | 0x000000 | The component is returned in a "pretty form", with most percent-encoded characters decoded. The exact behavior of PrettyDecoded varies from component to component and may also change from Qt release to Qt release. This is the default. |
| QUrl::EncodeSpaces | 0x100000 | Leave space characters in their encoded form ("%20"). |
| QUrl::EncodeUnicode | 0x200000 | Leave non-US-ASCII characters encoded in their UTF-8 percent-encoded form (e.g., "%C3%A9" for the U+00E9 codepoint, LATIN SMALL LETTER E WITH ACUTE). |
| QUrl::EncodeDelimiters | 0x400000 \| 0x800000 | Leave certain delimiters in their encoded form, as would appear in the URL when the full URL is represented as text. The delimiters affected by this option change from component to component. This flag has no effect in toString() or toEncoded(). |
| QUrl::EncodeReserved | 0x1000000 | Leave US-ASCII characters not permitted in the URL by the specification in their encoded form. This is the default on toString() and toEncoded(). |
| QUrl::DecodeReserved | 0x2000000 | Decode the US-ASCII characters that the URL specification does not allow to appear in the URL. This is the default on the getters of individual components. |
| QUrl::FullyEncoded | EncodeSpaces \| EncodeUnicode \| EncodeDelimiters \| EncodeReserved | Leave all characters in their properly-encoded form, as this component would appear as part of a URL. When used with toString(), this produces a fully-compliant URL in QString form, exactly equal to the result of toEncoded(). |
| QUrl::FullyDecoded | FullyEncoded \| DecodeReserved \| 0x4000000 | Attempt to decode as much as possible. For individual components of the URL, this decodes every percent encoding sequence, including control characters (U+0000 to U+001F) and UTF-8 sequences found in percent-encoded form. Use of this mode may cause data loss, see below for more information. |
The values of EncodeReserved and DecodeReserved should not be used together in one call. The behavior is undefined if that happens. They are provided as separate values because the behavior of the "pretty mode" with regard to reserved characters is different on certain components and especially on the full URL.
Full decoding
The FullyDecoded mode is similar to the behavior of the functions returning QString in Qt 4.x, in that every character represents itself and never has any special meaning. This is true even for the percent character ('%'), which should be interpreted to mean a literal percent, not the beginning of a percent-encoded sequence. The same actual character, in all other decoding modes, is represented by the sequence "%25".
Whenever re-applying data obtained with QUrl::FullyDecoded into a QUrl, care must be taken to use the QUrl::DecodedMode parameter to the setters (like setPath() and setUserName()). Failure to do so may cause re-interpretation of the percent character ('%') as the beginning of a percent-encoded sequence.
This mode is quite useful when portions of a URL are used in a non-URL context. For example, to extract the username, password or file paths in an FTP client application, the FullyDecoded mode should be used.
This mode should be used with care, since there are two conditions that cannot be reliably represented in the returned QString. They are:
- Non-UTF-8 sequences: URLs may contain sequences of percent-encoded characters that do not form valid UTF-8 sequences. Since URLs need to be decoded using UTF-8, any decoder failure will result in the QString containing one or more replacement characters where the sequence existed.
- Encoded delimiters: URLs are also allowed to make a distinction between a delimiter found in its literal form and its equivalent in percent-encoded form. This is most commonly found in the query, but is permitted in most parts of the URL.
The following example illustrates the problem:
QUrl original("http://example.com/?q=a%2B%3Db%26c");
QUrl copy(original);
copy.setQuery(copy.query(QUrl::FullyDecoded), QUrl::DecodedMode);
qDebug() << original.toString(); // prints: http://example.com/?q=a%2B%3Db%26c
qDebug() << copy.toString(); // prints: http://example.com/?q=a+=b&c
If the two URLs were used via HTTP GET, the interpretation by the web server would probably be different. In the first case, it would interpret as one parameter, with a key of "q" and value "a+=b&c". In the second case, it would probably interpret as two parameters, one with a key of "q" and value "a =b", and the second with a key "c" and no value.
This enum was introduced or modified in Qt 5.0.
See also QUrl::FormattingOptions.
enum QUrl::ParsingMode
The parsing mode controls the way QUrl parses strings.
| Constant | Value | Description |
|---|---|---|
| QUrl::TolerantMode | 0 | QUrl will try to correct some common errors in URLs. This mode is useful for parsing URLs coming from sources not known to be strictly standards-conforming. |
| QUrl::StrictMode | 1 | Only valid URLs are accepted. This mode is useful for general URL validation. |
| QUrl::DecodedMode | 2 | QUrl will interpret the URL component in the fully-decoded form, where percent characters stand for themselves, not as the beginning of a percent-encoded sequence. This mode is only valid for the setters setting components of a URL; it is not permitted in the QUrl constructor, in fromEncoded() or in setUrl(). For more information on this mode, see the documentation for QUrl::FullyDecoded. |
In TolerantMode, the parser has the following behaviour:
- Spaces and "%20": unencoded space characters will be accepted and will be treated as equivalent to "%20".
- Single "%" characters: Any occurrences of a percent character "%" not followed by exactly two hexadecimal characters (e.g., "13% coverage.html") will be replaced by "%25". Note that one lone "%" character will trigger the correction mode for all percent characters.
- Reserved and unreserved characters: An encoded URL should only contain a few characters as literals; all other characters should be percent-encoded. In TolerantMode, these characters will be accepted if they are found in the URL: space / double-quote / "<" / ">" / "\" / "^" / "`" / "{" / "|" / "}". Those same characters can be decoded again by passing QUrl::DecodeReserved to toString() or toEncoded(). In the getters of individual components, those characters are often returned in decoded form.
When in StrictMode, if a parsing error is found, isValid() will return false and errorString() will return a message describing the error. If more than one error is detected, it is undefined which error gets reported.
Note that TolerantMode is not usually enough for parsing user input, which often contains more errors and expectations than the parser can deal with. When dealing with data coming directly from the user -- as opposed to data coming from data-transfer sources, such as other programs -- it is recommended to use fromUserInput().
See also fromUserInput(), setUrl(), toString(), toEncoded(), and QUrl::FormattingOptions.
enum QUrl::UrlFormattingOption
The formatting options define how the URL is formatted when written out as text.
| Constant | Value | Description |
|---|---|---|
| QUrl::None | 0x0 | The format of the URL is unchanged. |
| QUrl::RemoveScheme | 0x1 | The scheme is removed from the URL. |
| QUrl::RemovePassword | 0x2 | Any password in the URL is removed. |
| QUrl::RemoveUserInfo | RemovePassword \| 0x4 | Any user information in the URL is removed. |
| QUrl::RemovePort | 0x8 | Any specified port is removed from the URL. |
| QUrl::RemoveAuthority | RemoveUserInfo \| RemovePort \| 0x10 | |
| QUrl::RemovePath | 0x20 | The URL's path is removed, leaving only the scheme, host address, and port (if present). |
| QUrl::RemoveQuery | 0x40 | The query part of the URL (following a '?' character) is removed. |
| QUrl::RemoveFragment | 0x80 | |
| QUrl::RemoveFilename | 0x800 | The filename (i.e. everything after the last '/' in the path) is removed. The trailing '/' is kept, unless StripTrailingSlash is set. Only valid if RemovePath is not set. |
| QUrl::PreferLocalFile | 0x200 | If the URL is a local file according to isLocalFile() and contains no query or fragment, a local file path is returned. |
| QUrl::StripTrailingSlash | 0x400 | The trailing slash is removed from the path, if one is present. |
| QUrl::NormalizePathSegments | 0x1000 | Modifies the path to remove redundant directory separators, and to resolve "."s and ".."s (as far as possible). For non-local paths, adjacent slashes are preserved. |
Note that the case folding rules in Nameprep, which QUrl conforms to, require host names to always be converted to lower case, regardless of the QUrl::FormattingOptions used.
The options from QUrl::ComponentFormattingOptions are also possible.
See also QUrl::ComponentFormattingOptions.
enum QUrl::UserInputResolutionOption
The user input resolution options define how fromUserInput() should interpret strings that could either be a relative path or the short form of an HTTP URL. For instance, file.pl can be either a local file or the URL http://file.pl.
| Constant | Value | Description |
|---|---|---|
| QUrl::DefaultResolution | 0 | The default resolution mechanism is to check whether a local file exists, in the working directory given to fromUserInput(), and only return a local path in that case. Otherwise a URL is assumed. |
| QUrl::AssumeLocalFile | 1 | This option makes fromUserInput() always return a local path unless the input contains a scheme, such as http://file.pl. This is useful for applications such as text editors, which are able to create the file if it doesn't exist. |
This enum was introduced or modified in Qt 5.4.
See also fromUserInput().
Macro Documentation
QT_NO_URL_CAST_FROM_STRING
Disables automatic conversions from QString (or char *) to QUrl.
Compiling your code with this define is useful when you have a lot of code that uses QString for file names and you wish to convert it to use QUrl for network transparency. In any code that uses QUrl, it can help avoid missing QUrl::resolved() calls, and other misuses of QString to QUrl conversions.
For example, if you have code like
url = filename; // probably not what you want
you can rewrite it as
url = QUrl::fromLocalFile(filename);
url = baseurl.resolved(QUrl(filename));
See also QT_NO_CAST_FROM_ASCII.