QString Class

The QString class provides a Unicode character string.

Header: #include <QString>
qmake: QT += core

Note: All functions in this class are reentrant.

Public Types

typedef ConstIterator
typedef Iterator
enum NormalizationForm { NormalizationForm_D, NormalizationForm_C, NormalizationForm_KD, NormalizationForm_KC }
enum SectionFlag { SectionDefault, SectionSkipEmpty, SectionIncludeLeadingSep, SectionIncludeTrailingSep, SectionCaseInsensitiveSeps }
typedef const_iterator
typedef const_pointer
typedef const_reference
typedef const_reverse_iterator
typedef difference_type
typedef iterator
typedef pointer
typedef reference
typedef reverse_iterator
typedef size_type
typedef value_type

Macros

Detailed Description

QString stores a string of 16-bit QChars, where each QChar corresponds to one UTF-16 code unit. (Unicode characters with code values above 65535 are stored using surrogate pairs, i.e., two consecutive QChars.)
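The surrogate-pair rule can be sketched in standard C++ without Qt. The helper name encodeUtf16 is hypothetical and not part of the Qt API; the sketch only follows the UTF-16 encoding rules to show when a code point needs two 16-bit units.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a Qt function): encodes one Unicode code point
// as one or two UTF-16 code units.
std::u16string encodeUtf16(char32_t codePoint)
{
    assert(codePoint <= 0x10FFFF);                       // valid Unicode range
    assert(codePoint < 0xD800 || codePoint > 0xDFFF);    // not itself a surrogate

    std::u16string units;
    if (codePoint <= 0xFFFF) {
        // Fits in a single 16-bit unit.
        units.push_back(static_cast<char16_t>(codePoint));
    } else {
        // Above 65535: split the 20-bit offset into a high and a low surrogate.
        const char32_t offset = codePoint - 0x10000;
        units.push_back(static_cast<char16_t>(0xD800 + (offset >> 10)));
        units.push_back(static_cast<char16_t>(0xDC00 + (offset & 0x3FF)));
    }
    return units;
}
```

For instance, encodeUtf16(0x1F600) yields the two units 0xD83D and 0xDE00, which a QString would hold as two consecutive QChars.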

Unicode is an international standard that supports most of the writing systems in use today. It is a superset of US-ASCII (ANSI X3.4-1986) and Latin-1 (ISO 8859-1), and all the US-ASCII/Latin-1 characters are available at the same code positions.

Behind the scenes, QString uses implicit sharing (copy-on-write) to reduce memory usage and to avoid the needless copying of data. This also helps reduce the inherent overhead of storing 16-bit characters instead of 8-bit characters.
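As a rough illustration of copy-on-write, here is a toy class in standard C++. It is not how QString is implemented: SharedText is a hypothetical name, std::shared_ptr stands in for Qt's own reference counting, and the sketch is not thread-safe.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Toy copy-on-write string (hypothetical, not QString's implementation).
class SharedText {
public:
    explicit SharedText(std::u16string text)
        : d(std::make_shared<std::u16string>(std::move(text))) {}

    // Read access never copies the character data.
    char16_t at(std::size_t i) const { return (*d)[i]; }
    std::size_t size() const { return d->size(); }

    // Write access detaches first, so other copies are not affected.
    void set(std::size_t i, char16_t c)
    {
        detach();
        (*d)[i] = c;
    }

    // True while both objects still point at the same data block.
    bool sharesDataWith(const SharedText &other) const { return d == other.d; }

private:
    // Makes a private deep copy only if the data is currently shared.
    void detach()
    {
        if (d.use_count() > 1)
            d = std::make_shared<std::u16string>(*d);
    }

    std::shared_ptr<std::u16string> d;
};
```

Copying a SharedText only copies a pointer; the deep copy is deferred until set() is called on a copy that still shares its data.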

In addition to QString, Qt also provides the QByteArray class to store raw bytes and traditional 8-bit '\0'-terminated strings. For most purposes, QString is the class you want to use. It is used throughout the Qt API, and the Unicode support ensures that your applications will be easy to translate if you want to expand your application's market at some point. The two main cases where QByteArray is appropriate are when you need to store raw binary data, and when memory conservation is critical (like in embedded systems).

Initializing a String

One way to initialize a QString is simply to pass a const char * to its constructor. For example, the following code creates a QString of size 5 containing the data "Hello":

 QString str = "Hello";

QString converts the const char * data into Unicode using the fromUtf8() function.

In all of the QString functions that take const char * parameters, the const char * is interpreted as a classic C-style '\0'-terminated string encoded in UTF-8. It is legal for the const char * parameter to be nullptr.

You can also provide string data as an array of QChars:

 static const QChar data[4] = { 0x0055, 0x006e, 0x10e3, 0x03a3 };
 QString str(data, 4);

QString makes a deep copy of the QChar data, so you can modify it later without experiencing side effects. (If for performance reasons you don't want to take a deep copy of the character data, use QString::fromRawData() instead.)

Another approach is to set the size of the string using resize() and to initialize the data character per character. QString uses 0-based indexes, just like C++ arrays. To access the character at a particular index position, you can use operator[](). On non-const strings, operator[]() returns a reference to a character that can be used on the left side of an assignment. For example:

 QString str;
 str.resize(4);

 str[0] = QChar('U');
 str[1] = QChar('n');
 str[2] = QChar(0x10e3);
 str[3] = QChar(0x03a3);

For read-only access, an alternative syntax is to use the at() function:

 QString str;

 for (int i = 0; i < str.size(); ++i) {
     if (str.at(i) >= QChar('a') && str.at(i) <= QChar('f'))
         qDebug() << "Found character in range [a-f]";
 }

The at() function can be faster than operator[](), because it never causes a deep copy to occur. Alternatively, use the left(), right(), or mid() functions to extract several characters at a time.

A QString can embed '\0' characters (QChar::Null). The size() function always returns the size of the whole string, including embedded '\0' characters.

After a call to the resize() function, newly allocated characters have undefined values. To set all the characters in the string to a particular value, use the fill() function.

QString provides dozens of overloads designed to simplify string usage. For example, if you want to compare a QString with a string literal, you can write code like this and it will work as expected:

 QString str;

 if (str == "auto" || str == "extern"
         || str == "static" || str == "register") {
     // ...
 }

You can also pass string literals to functions that take QStrings as arguments, invoking the QString(const char *) constructor. Similarly, you can pass a QString to a function that takes a const char * argument using the qPrintable() macro which returns the given QString as a const char *. This is equivalent to calling <QString>.toLocal8Bit().constData().

Manipulating String Data

QString provides the following basic functions for modifying the character data: append(), prepend(), insert(), replace(), and remove(). For example:

 QString str = "and";
 str.prepend("rock ");     // str == "rock and"
 str.append(" roll");        // str == "rock and roll"
 str.replace(5, 3, "&");   // str == "rock & roll"

If you are building a QString gradually and know in advance approximately how many characters the QString will contain, you can call reserve(), asking QString to preallocate a certain amount of memory. You can also call capacity() to find out how much memory QString actually allocated.

The replace() and remove() functions' first two arguments are the position from which to start erasing and the number of characters that should be erased. If you want to replace all occurrences of a particular substring with another, use one of the two-parameter replace() overloads.

A frequent requirement is to remove whitespace characters from a string ('\n', '\t', ' ', etc.). If you want to remove whitespace from both ends of a QString, use the trimmed() function. If you want to remove whitespace from both ends and replace multiple consecutive whitespaces with a single space character within the string, use simplified().
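The difference between the two operations can be sketched in standard C++. The functions below are hypothetical ASCII-only stand-ins, not Qt code: they use std::isspace, whereas QString works on Unicode characters.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for trimming: strips whitespace from both ends only.
std::string trimmedAscii(const std::string &in)
{
    std::size_t begin = 0;
    std::size_t end = in.size();
    while (begin < end && std::isspace(static_cast<unsigned char>(in[begin])))
        ++begin;
    while (end > begin && std::isspace(static_cast<unsigned char>(in[end - 1])))
        --end;
    return in.substr(begin, end - begin);
}

// Hypothetical stand-in for simplifying: strips both ends and collapses each
// internal run of whitespace into a single space.
std::string simplifiedAscii(const std::string &in)
{
    std::string out;
    bool pendingSpace = false;
    for (unsigned char c : in) {
        if (std::isspace(c)) {
            // Remember the gap, but ignore whitespace before the first word.
            pendingSpace = !out.empty();
        } else {
            if (pendingSpace)
                out += ' ';
            pendingSpace = false;
            out += static_cast<char>(c);
        }
    }
    return out;  // a trailing gap is never flushed, so the end is trimmed too
}
```

Given "  lots\t of\nwhitespace\r\n ", trimmedAscii() keeps the internal tab and newline, while simplifiedAscii() returns "lots of whitespace".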

If you want to find all occurrences of a particular character or substring in a QString, use the indexOf() or lastIndexOf() functions. The former searches forward starting from a given index position, the latter searches backward. Both return the index position of the character or substring if they find it; otherwise, they return -1. For example, here is a typical loop that finds all occurrences of a particular substring:

 QString str = "We must be <b>bold</b>, very <b>bold</b>";
 int j = 0;

 while ((j = str.indexOf("<b>", j)) != -1) {
     qDebug() << "Found <b> tag at index position" << j;
     ++j;
 }

QString provides many functions for converting numbers into strings and strings into numbers. See the arg() functions, the setNum() functions, the number() static functions, and the toInt(), toDouble(), and similar functions.

To get an upper- or lowercase version of a string use toUpper() or toLower().

Lists of strings are handled by the QStringList class. You can split a string into a list of strings using the split() function, and join a list of strings into a single string with an optional separator using QStringList::join(). You can obtain a list of strings from a string list that contain a particular substring or that match a particular QRegExp using the QStringList::filter() function.

Querying String Data

If you want to see if a QString starts or ends with a particular substring use startsWith() or endsWith(). If you simply want to check whether a QString contains a particular character or substring, use the contains() function. If you want to find out how many times a particular character or substring occurs in the string, use count().

To obtain a pointer to the actual character data, call data() or constData(). These functions return a pointer to the beginning of the QChar data. The pointer is guaranteed to remain valid until a non-const function is called on the QString.

Comparing Strings

QStrings can be compared using overloaded operators such as operator<(), operator<=(), operator==(), operator>=(), and so on. Note that the comparison is based exclusively on the numeric Unicode values of the characters. It is very fast, but is not what a human would expect; the QString::localeAwareCompare() function is usually a better choice for sorting user-interface strings, when such a comparison is available.
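To see why ordering by numeric code values can surprise users, here is a small sketch in standard C++. It uses std::u16string as a stand-in for QString, on the assumption that comparing UTF-16 code units matches the ordering described above; sortByCodeUnits is a hypothetical name.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sorts by raw UTF-16 code unit values; no locale rules are applied.
std::vector<std::u16string> sortByCodeUnits(std::vector<std::u16string> words)
{
    std::sort(words.begin(), words.end());
    return words;
}
```

Sorting "apple", "Zebra", and "banana" this way puts "Zebra" first, because 'Z' (0x5A) has a smaller code value than 'a' (0x61). A locale-aware comparison would typically order the words differently.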

On Unix-like platforms (including Linux, macOS and iOS), when Qt is linked with the ICU library (which it usually is), its locale-aware sorting is used. Otherwise, on macOS and iOS, localeAwareCompare() compares according to the "Order for sorted lists" setting in the International preferences panel. On other Unix-like systems without ICU, the comparison falls back to the system library's strcoll(); when strcoll() considers the strings equal, it falls back further to QString's (locale-unaware) comparison, described above.

Converting Between 8-Bit Strings and Unicode Strings

QString provides the following three functions that return a const char * version of the string as QByteArray: toUtf8(), toLatin1(), and toLocal8Bit().

  • toLatin1() returns a Latin-1 (ISO 8859-1) encoded 8-bit string.
  • toUtf8() returns a UTF-8 encoded 8-bit string. UTF-8 is a superset of US-ASCII (ANSI X3.4-1986) that supports the entire Unicode character set through multibyte sequences.
  • toLocal8Bit() returns an 8-bit string using the system's local encoding.

To convert from one of these encodings, QString provides fromLatin1(), fromUtf8(), and fromLocal8Bit(). Other encodings are supported through the QTextCodec class.

As mentioned above, QString provides a lot of functions and operators that make it easy to interoperate with const char * strings. But this functionality is a double-edged sword: It makes QString more convenient to use if all strings are US-ASCII or Latin-1, but there is always the risk that an implicit conversion from or to const char * is done using the wrong 8-bit encoding. To minimize these risks, you can turn off these implicit conversions by defining some of the following preprocessor symbols:

  • QT_NO_CAST_FROM_ASCII disables automatic conversions from C string literals and pointers to Unicode.
  • QT_RESTRICTED_CAST_FROM_ASCII allows automatic conversions from C characters and character arrays, but disables automatic conversions from character pointers to Unicode.
  • QT_NO_CAST_TO_ASCII disables automatic conversion from QString to C strings.

You then need to explicitly call fromUtf8(), fromLatin1(), or fromLocal8Bit() to construct a QString from an 8-bit string, or use the lightweight QLatin1String class, for example:

 QString url = QLatin1String("http://www.unicode.org/");

Similarly, you must call toLatin1(), toUtf8(), or toLocal8Bit() explicitly to convert the QString to an 8-bit string. (Other encodings are supported through the QTextCodec class.)

Note for C Programmers

Due to C++'s type system and the fact that QString is implicitly shared, QStrings may be treated like ints or other basic types. For example:

 QString Widget::boolToString(bool b)
 {
     QString result;
     if (b)
         result = "True";
     else
         result = "False";
     return result;
 }

The result variable is a normal variable allocated on the stack. When return is called, and because we're returning by value, the copy constructor is called and a copy of the string is returned. No actual copying takes place thanks to the implicit sharing.

Distinction Between Null and Empty Strings

For historical reasons, QString distinguishes between a null string and an empty string. A null string is a string that is initialized using QString's default constructor or by passing (const char *)0 to the constructor. An empty string is any string with size 0. A null string is always empty, but an empty string isn't necessarily null:

 QString().isNull();               // returns true
 QString().isEmpty();              // returns true

 QString("").isNull();             // returns false
 QString("").isEmpty();            // returns true

 QString("abc").isNull();          // returns false
 QString("abc").isEmpty();         // returns false

All functions except isNull() treat null strings the same as empty strings. For example, toUtf8().constData() returns a valid pointer (not nullptr) to a '\0' character for a null string. We recommend that you always use the isEmpty() function and avoid isNull().

Argument Formats

In member functions where an argument format can be specified (e.g., arg(), number()), the argument format can be one of the following:

Format  Meaning
e       format as [-]9.9e[+|-]999
E       format as [-]9.9E[+|-]999
f       format as [-]9.9
g       use e or f format, whichever is the most concise
G       use E or f format, whichever is the most concise

A precision is also specified with the argument format. For the 'e', 'E', and 'f' formats, the precision represents the number of digits after the decimal point. For the 'g' and 'G' formats, the precision represents the maximum number of significant digits (trailing zeroes are omitted).
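The format letters and precision rules above read the same as the C printf conversions of the same names, so standard C++ can give a feel for the output. Treat that correspondence as an assumption: exact output from arg() or number() may differ in details. The helper formatDouble is hypothetical and not part of Qt.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: formats a double with one of the five format letters
// listed above, using the C library's printf-style conversions.
std::string formatDouble(double value, char format, int precision)
{
    // Accept only 'e', 'E', 'f', 'g', and 'G'.
    if (format == '\0' || std::strchr("eEfgG", format) == nullptr)
        return std::string();

    const char spec[] = { '%', '.', '*', format, '\0' };  // e.g. "%.*f"

    // First pass measures the output, second pass writes it.
    const int length = std::snprintf(nullptr, 0, spec, precision, value);
    if (length < 0)
        return std::string();
    std::vector<char> buffer(static_cast<std::size_t>(length) + 1);
    std::snprintf(buffer.data(), buffer.size(), spec, precision, value);
    return std::string(buffer.data(), static_cast<std::size_t>(length));
}
```

With this sketch, formatDouble(3.14159, 'f', 2) gives "3.14" (two digits after the decimal point), while formatDouble(3.14159, 'g', 3) also gives "3.14" because the precision then counts significant digits.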

More Efficient String Construction

Many strings are known at compile time. But the trivial constructor QString("Hello") will copy the contents of the string, converting them from UTF-8 as described above. To avoid this, you can use the QStringLiteral macro to create the required data directly at compile time. Constructing a QString out of the literal then causes no overhead at runtime.

A slightly less efficient way is to use QLatin1String. This class wraps a C string literal, precalculates its length at compile time and can then be used for faster comparison with QStrings and conversion to QStrings than a regular C string literal.

Using the QString '+' operator, it is easy to construct a complex string from multiple substrings. You will often write code like this:

     QString foo;
     QString type = "long";

     foo = QLatin1String("vector<") + type + QLatin1String(">::iterator");

     if (foo.startsWith("(" + type + ") 0x"))
         ...

There is nothing wrong with either of these string constructions, but there are a few hidden inefficiencies. Beginning with Qt 4.6, you can eliminate them.

First, multiple uses of the '+' operator usually mean multiple memory allocations. When concatenating n substrings, where n > 2, there can be as many as n - 1 calls to the memory allocator.

In Qt 4.6, an internal template class QStringBuilder was added along with a few helper functions. This class is marked internal and does not appear in the documentation, because you aren't meant to instantiate it in your code. Its use will be automatic, as described below. The class is found in src/corelib/tools/qstringbuilder.cpp if you want to have a look at it.

QStringBuilder uses expression templates and reimplements the '%' operator so that when you use '%' for string concatenation instead of '+', multiple substring concatenations will be postponed until the final result is about to be assigned to a QString. At this point, the amount of memory required for the final result is known. The memory allocator is then called once to get the required space, and the substrings are copied into it one by one.
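The expression-template idea can be sketched in standard C++ with std::string. This is a minimal illustration of the technique, not QStringBuilder's actual code: Concat, totalSize, and appendTo are hypothetical names, and only std::string operands are handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical expression node: it only remembers its two operands.
template <typename Left, typename Right>
struct Concat {
    const Left &left;
    const Right &right;

    // Materializes the result when the expression is assigned to a string.
    operator std::string() const;
};

// Length of a plain string, or of a whole expression tree.
inline std::size_t totalSize(const std::string &s) { return s.size(); }

template <typename L, typename R>
std::size_t totalSize(const Concat<L, R> &c)
{
    return totalSize(c.left) + totalSize(c.right);
}

// Copies a plain string, or every leaf of an expression tree, into out.
inline void appendTo(std::string &out, const std::string &s) { out += s; }

template <typename L, typename R>
void appendTo(std::string &out, const Concat<L, R> &c)
{
    appendTo(out, c.left);
    appendTo(out, c.right);
}

template <typename L, typename R>
Concat<L, R>::operator std::string() const
{
    std::string out;
    out.reserve(totalSize(*this));  // one request for the full final size
    appendTo(out, *this);           // then copy the pieces one by one
    return out;
}

// '%' builds nodes instead of concatenating immediately.
inline Concat<std::string, std::string> operator%(const std::string &l,
                                                  const std::string &r)
{
    return {l, r};
}

template <typename L, typename R>
Concat<Concat<L, R>, std::string> operator%(const Concat<L, R> &l,
                                            const std::string &r)
{
    return {l, r};
}
```

Writing std::string message = hello % space % world; builds a small tree of references and performs the copy only in the conversion to std::string. The nodes hold references to their operands, so the expression must be converted within the same statement that creates it.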

Additional efficiency is gained by inlining and reduced reference counting (the QString created from a QStringBuilder typically has a ref count of 1, whereas QString::append() needs an extra test).

There are two ways you can access this improved method of string construction. The straightforward way is to include QStringBuilder wherever you want to use it, and use the '%' operator instead of '+' when concatenating strings:

     #include <QStringBuilder>

     QString hello("hello");
     QStringRef el(&hello, 2, 3);
     QLatin1String world("world");
     QString message =  hello % el % world % QChar('!');

A more global approach, which is the most convenient but not entirely source compatible, is to define this in your .pro file:

     DEFINES *= QT_USE_QSTRINGBUILDER

and the '+' will automatically be performed as the QStringBuilder '%' everywhere.

Maximum Size and Out-of-memory Conditions

The current version of QString is limited to just under 2 GB (2^31 bytes) in size. The exact limit is architecture-dependent, since it depends on the overhead required for managing the data block; that overhead is no more than 32 bytes. Raw data blocks are also limited to 2 GB minus 1 byte by the use of the int type in the current version. Since QString uses two bytes per character, that translates to just under 2^30 characters in one QString.
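The arithmetic behind the character limit can be checked with a few constants in standard C++. The names below are hypothetical, and the 32-byte overhead is only the upper bound quoted above, not an exact figure for any platform.

```cpp
#include <cstdint>

// 2 GB minus 1 byte: the largest value a 32-bit signed int can hold.
constexpr std::int64_t kMaxBlockBytes = (std::int64_t{1} << 31) - 1;
constexpr std::int64_t kBytesPerChar = 2;  // one UTF-16 code unit

// Characters that fit once the (assumed) bookkeeping overhead is subtracted.
constexpr std::int64_t maxCharacters(std::int64_t overheadBytes)
{
    return (kMaxBlockBytes - overheadBytes) / kBytesPerChar;
}

static_assert(maxCharacters(32) < (std::int64_t{1} << 30),
              "just under 2^30 characters");
```

With the assumed 32-byte overhead, maxCharacters(32) evaluates to 1073741807, which is 17 short of 2^30 (1073741824).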

In case memory allocation fails, QString will throw a std::bad_alloc exception. Out of memory conditions in the Qt containers are the only case where Qt will throw exceptions.

Note that the operating system may impose further limits on applications holding a lot of allocated memory, especially large, contiguous blocks. Such considerations, the configuration of such behavior or any mitigation are outside the scope of the Qt API.

See also fromRawData(), QChar, QLatin1String, QByteArray, and QStringRef.

Member Type Documentation

typedef QString::ConstIterator

Qt-style synonym for QString::const_iterator.

typedef QString::Iterator

Qt-style synonym for QString::iterator.

enum QString::NormalizationForm

This enum describes the various normalized forms of Unicode text.

Constant                        Value  Description
QString::NormalizationForm_D    0      Canonical Decomposition
QString::NormalizationForm_C    1      Canonical Decomposition followed by Canonical Composition
QString::NormalizationForm_KD   2      Compatibility Decomposition
QString::NormalizationForm_KC   3      Compatibility Decomposition followed by Canonical Composition

See also normalized() and Unicode Standard Annex #15.

enum QString::SectionFlag

This enum specifies flags that can be used to affect various aspects of the section() function's behavior with respect to separators and empty fields.

Constant                             Value  Description
QString::SectionDefault              0x00   Empty fields are counted, leading and trailing separators are not included, and the separator is compared case sensitively.
QString::SectionSkipEmpty            0x01   Treat empty fields as if they don't exist, i.e. they are not considered as far as start and end are concerned.
QString::SectionIncludeLeadingSep    0x02   Include the leading separator (if any) in the result string.
QString::SectionIncludeTrailingSep   0x04   Include the trailing separator (if any) in the result string.
QString::SectionCaseInsensitiveSeps  0x08   Compare the separator case-insensitively.

See also section().

typedef QString::const_iterator

See also QString::iterator.

typedef QString::const_pointer

The QString::const_pointer typedef provides an STL-style const pointer to a QString element (QChar).

typedef QString::const_reference

typedef QString::const_reverse_iterator

This typedef was introduced in Qt 5.6.

See also QString::reverse_iterator and QString::const_iterator.

typedef QString::difference_type

typedef QString::iterator

See also QString::const_iterator.

typedef QString::pointer

The QString::pointer typedef provides an STL-style pointer to a QString element (QChar).

typedef QString::reference

typedef QString::reverse_iterator

This typedef was introduced in Qt 5.6.

See also QString::const_reverse_iterator and QString::iterator.

typedef QString::size_type

typedef QString::value_type

Macro Documentation

QStringLiteral(str)

The macro generates the data for a QString out of the string literal str at compile time. Creating a QString from it is free in this case, and the generated string data is stored in the read-only segment of the compiled object file.

If you have code that looks like this:

 // hasAttribute takes a QString argument
 if (node.hasAttribute("http-contents-length")) //...

then a temporary QString will be created to be passed as the hasAttribute function parameter. This can be quite expensive, as it involves a memory allocation and the copy/conversion of the data into QString's internal encoding.

This cost can be avoided by using QStringLiteral instead:

 if (node.hasAttribute(QStringLiteral(u"http-contents-length"))) //...

In this case, QString's internal data will be generated at compile time; no conversion or allocation will occur at runtime.

Using QStringLiteral instead of a double quoted plain C++ string literal can significantly speed up creation of QString instances from data known at compile time.

Note: QLatin1String can still be more efficient than QStringLiteral when the string is passed to a function that has an overload taking QLatin1String and this overload avoids conversion to QString. For instance, QString::operator==() can compare to a QLatin1String directly:

 if (attribute.name() == QLatin1String("http-contents-length")) //...

Note: Some compilers have bugs encoding strings containing characters outside the US-ASCII character set. Make sure you prefix your string with u in those cases. It is optional otherwise.

See also QByteArrayLiteral.

QT_NO_CAST_FROM_ASCII

Disables automatic conversions from 8-bit strings (char *) to Unicode QStrings.

See also QT_NO_CAST_TO_ASCII, QT_RESTRICTED_CAST_FROM_ASCII, and QT_NO_CAST_FROM_BYTEARRAY.

QT_NO_CAST_TO_ASCII

Disables automatic conversion from QString to 8-bit strings (char *).

See also QT_NO_CAST_FROM_ASCII, QT_RESTRICTED_CAST_FROM_ASCII, and QT_NO_CAST_FROM_BYTEARRAY.

QT_RESTRICTED_CAST_FROM_ASCII

Disables most automatic conversions from source literals and 8-bit data to Unicode QStrings, but allows the use of the QChar(char) and QString(const char (&ch)[N]) constructors, and the QString::operator=(const char (&ch)[N]) assignment operator. This gives most of the type-safety benefits of QT_NO_CAST_FROM_ASCII but does not require user code to wrap character and string literals with QLatin1Char, QLatin1String or similar.

Using this macro together with source strings outside the 7-bit range, non-literals, or literals with embedded NUL characters is undefined.

See also QT_NO_CAST_FROM_ASCII and QT_NO_CAST_TO_ASCII.