QStringView Diaries: Masters Of The Overloads
How QStringView actively manages implicit conversions
The last blog post in this series described how to use string-views. This post is about how to design one. In particular, it’s about QStringView’s constructors. They evolved through a rapid succession of changes. These changes either fixed ambiguities between QString and QStringView overloads, or improved performance. And they all have the same solution: std::enable_if, the Swiss Army Knife for overload control.
This post will take you from where we naïvely started to where we made the impossible possible: overloading a function for arrays and pointers.
The Naïve Beginning
How do you design a string-view class? We, at least, started by making a list of types that should be implicitly convertible to the string-view:
- QString, of course
- QStringRef
- std::u16string
- char16_t literals: u"Hello"
- const QChar* and const char16_t*, with and without an explicit size argument
- Same for ushort, because Qt uses that in many low-level APIs
- On Windows: std::wstring, wchar_t literals (L"Hello"), const wchar_t* (since wchar_t is 2 bytes on Windows, and because we need to still support MSVC 2013, which does not know about char16_t)
- …
Next, pour that into a list of non-explicit constructors:
    QStringView() : m_data(nullptr), m_size(0) {}
    QStringView(std::nullptr_t) : QStringView() {}
    #ifdef Q_COMPILER_UNICODE_STRINGS
    QStringView(const char16_t *data, qssize_t size)
        : m_data(data), m_size(size) {}
    QStringView(const char16_t *data)
        : QStringView(data, data ? lengthHelper(data) : 0) {}
    #endif
    #ifdef Q_OS_WIN
    QStringView(const wchar_t *data /** BOOOOOOOOOOORING!! **/
Enable_If, The First: Implementation Convenience
If you’re like me, you get bored with repeating the same constructors for char16_t, wchar_t, ushort and QChar, #ifdef’ing them for compiler support and platforms. And you write a template instead. Which uses std::enable_if, of course.
We first define some template aliases to make the use of enable_if a bit more readable:
    template <typename Char>
    using if_compatible_char = typename std::enable_if<QtPrivate::IsCompatibleChar<Char>::value>::type*;
    template <typename String>
    using if_compatible_stdstring = typename std::enable_if<QtPrivate::IsCompatibleStdBasicString<String>::value>::type*;
For all the gory details of QtPrivate::IsCompatibleChar and QtPrivate::IsCompatibleStdBasicString see here and here, respectively.
With this, we can implement:
    QStringView() : m_data(nullptr), m_size(0) {}
    QStringView(std::nullptr_t) : QStringView() {}

    template <typename Char, typename = if_compatible_char<Char>>
    QStringView(const Char *data, qssize_t size)
        : m_data(castHelper(data)), m_size(size) {}
    template <typename Char, typename = if_compatible_char<Char>>
    QStringView(const Char *data)
        : m_data(castHelper(data)), m_size(data ? lengthHelper(data) : 0) {}

    template <typename String, typename = if_compatible_stdstring<String>>
    QStringView(const String &str)
        : QStringView(str.data(), qssize_t(str.size())) {}

    QStringView(const QString &str)
        : QStringView(str.isNull() ? nullptr : str.data(), qssize_t(str.size())) {}
    QStringView(const QStringRef &str)
        : QStringView(str.isNull() ? nullptr : str.data(), qssize_t(str.size())) {}
The idea here is that enable_if only defines the nested typedef type if its template argument evaluates to true. If it evaluates to false, the request for ::type will fail, but because of SFINAE, this will not be an error. The template is simply not considered (removed from the set of possible overloads).
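To see the mechanism in isolation, here is a minimal, Qt-free sketch. The names View and IsCompatibleChar are stand-ins invented for this illustration (the real trait in QtPrivate accepts more types), and std::ptrdiff_t takes the place of qssize_t.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for QtPrivate::IsCompatibleChar: accepts only
// (possibly cv-qualified) char16_t.
template <typename Char>
struct IsCompatibleChar
    : std::is_same<typename std::remove_cv<Char>::type, char16_t> {};

// Resolves to void* if Char is compatible; otherwise ::type does not exist
// and substitution fails.
template <typename Char>
using if_compatible_char =
    typename std::enable_if<IsCompatibleChar<Char>::value>::type*;

class View {
public:
    // The constraint is the default of an unnamed second type parameter.
    template <typename Char, typename = if_compatible_char<Char>>
    constexpr View(const Char *data, std::ptrdiff_t size)
        : m_data(data), m_size(size) {}

    constexpr std::ptrdiff_t size() const { return m_size; }
    constexpr const char16_t *data() const { return m_data; }

private:
    const char16_t *m_data;
    std::ptrdiff_t m_size;
};

// char16_t satisfies the condition, so the constructor exists...
static_assert(std::is_constructible<View, const char16_t *, std::ptrdiff_t>::value, "");
// ...while for char the template drops out of the overload set.
static_assert(!std::is_constructible<View, const char *, std::ptrdiff_t>::value, "");
```

Note that the rejected case is not a hard error inside the template: the constructor simply does not exist for char, which is what std::is_constructible reports.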
It compiles! Ship it!
Working Around MSVC 2013, The First
Unfortunately, MSVC 2013 does not like that particular use of enable_if:
    tst_qstringview.cpp(119) : error C2338: CanConvert<std::wstring>::value == CanConvertFromWCharT
    tst_qstringview.cpp(120) : error C2338: CanConvert<const std::wstring>::value == CanConvertFromWCharT
    tst_qstringview.cpp(121) : error C2338: CanConvert<std::wstring&>::value == CanConvertFromWCharT
    tst_qstringview.cpp(122) : error C2338: CanConvert<const std::wstring&>::value == CanConvertFromWCharT
With the above error message, MSVC tells us that a static_assert fails. This assertion is from the QStringView unit test. It’s checking that QStringView can be constructed from a std::wstring. And that fails. MSVC 2013 does not support std::u16string, so it’s basically all std::basic_string constructors that do not work. MSVC 2017 works perfectly well.
Thankfully, The Qt Company recently hired Ville Voutilainen, the C++ Evolution Working Group Chair, so we can now pick his brain on such matters. He suggested a slightly different approach, which yours truly folded into the existing type alias as follows:
    template <typename Char>
    using if_compatible_char = typename std::enable_if<QtPrivate::IsCompatibleChar<Char>::value, bool>::type;
    template <typename String>
    using if_compatible_stdstring = typename std::enable_if<QtPrivate::IsCompatibleStdBasicString<String>::value, bool>::type;

    template <typename Char, if_compatible_char<Char> = true>
    QStringView(const Char *data, qssize_t size)
        : m_data(castHelper(data)), m_size(size) {}
Do you spot the difference?
Instead of using if_compatible_char as the default value of the second, unnamed template argument, we make it the second template argument itself. In the success case, if_compatible_char resolves to bool, which we then default to true. In the failure case, we hit SFINAE and the template is removed from the overload set, as before.
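The same form can be tried outside of Qt with a free function. In this sketch, accepts() and the char16_t-only IsCompatibleChar are hypothetical names made up for the demonstration; the non-template overload only serves as a fallback that reveals when the template was removed.

```cpp
#include <type_traits>

// Hypothetical stand-in for QtPrivate::IsCompatibleChar.
template <typename Char>
struct IsCompatibleChar
    : std::is_same<typename std::remove_cv<Char>::type, char16_t> {};

// Resolves to bool when the condition holds; has no type otherwise.
template <typename Char>
using if_compatible_char =
    typename std::enable_if<IsCompatibleChar<Char>::value, bool>::type;

// The constraint is now the type of an unnamed non-type template parameter,
// defaulted to true, rather than the default value of a type parameter.
template <typename Char, if_compatible_char<Char> = true>
constexpr bool accepts(const Char *) { return true; }

// Fallback for all other pointers (for demonstration only).
constexpr bool accepts(const void *) { return false; }

static_assert(accepts(u"text"), "");   // Char = char16_t: template selected
static_assert(!accepts("text"), "");   // Char = char: template removed by SFINAE
static_assert(!accepts(L"text"), "");  // wchar_t is not accepted by this toy trait
```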
It compiles! Even on MSVC 2013! Ship it!
Working Around MSVC 2013, The Second
Next, we hit a bug where MSVC 2013 allows two user-defined conversions when matching a function argument to the function’s parameter types. This is apparently well-known, but kept around for compatibility reasons, with a gradual removal path over the next few MSVC versions.
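For reference, this is what a conforming compiler does. The sketch uses two made-up stand-in types, Char and String, that mimic the QChar and QString constructors involved; std::is_convertible checks exactly the kind of implicit conversion that happens when passing a function argument.

```cpp
#include <type_traits>

struct Char {                       // stand-in for QChar
    constexpr Char(unsigned short u) : ucs(u) {}
    unsigned short ucs;
};

struct String {                     // stand-in for QString
    constexpr String(Char c) : first(c) {}
    Char first;
};

// One user-defined conversion each: allowed.
static_assert(std::is_convertible<unsigned short, Char>::value, "");
static_assert(std::is_convertible<Char, String>::value, "");

// unsigned short -> Char -> String would chain two user-defined
// conversions, which the standard does not permit implicitly.
static_assert(!std::is_convertible<unsigned short, String>::value, "");
```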
The bug manifested itself by QStringView overloads accepting, say, L'x' and ushorts via (brace yourself) QString(QChar(int(L'x'))) and QString(QChar(ushortValue)), respectively. Yours truly decided to tackle that by adding deleted QStringView constructors for all the types QString accepts, but we didn’t want:
    QStringView(QLatin1String) = delete;
    QStringView(const char *) = delete;
    template <typename Char, if_compatible_char<Char> = true>
    QStringView(Char) = delete;
    // ...
This way, an attempt to construct a QStringView from, say, a QChar, or anything convertible to QChar, would hit one of the deleted constructors and be rejected by the compiler.
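A stripped-down sketch of the effect, with a hypothetical View class standing in for QStringView and only two of the deleted constructors:

```cpp
#include <cstddef>
#include <type_traits>

class View {
public:
    constexpr View(const char16_t *data, std::ptrdiff_t size)
        : m_data(data), m_size(size) {}

    // Explicitly reject narrow strings and single characters.
    View(const char *) = delete;
    View(char16_t) = delete;

    constexpr std::ptrdiff_t size() const { return m_size; }
    constexpr const char16_t *data() const { return m_data; }

private:
    const char16_t *m_data;
    std::ptrdiff_t m_size;
};

static_assert(std::is_constructible<View, const char16_t *, std::ptrdiff_t>::value, "");
// Overload resolution selects the deleted constructor, so the
// construction is ill-formed.
static_assert(!std::is_constructible<View, const char *>::value, "");
static_assert(!std::is_constructible<View, char16_t>::value, "");
```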
This eventually got merged, and appeared to work for a while.
Until we started to add QStringView overloads to existing functions taking QString.
Ambiguous Overloads
Consider a function taking QString overloaded with a function taking QStringView:
    bool isValidIdentifier(const QString &id); // pre-existing
    bool isValidIdentifier(QStringView id);    // newly added
This situation will come up all the time in the transition period until Qt 6: We weed out QString parameters for QStringView ones, but can’t remove the QString overloads because of binary compatibility.
Now consider these perfectly fine existing calls:
    isValidIdentifier(QLatin1String("QString")); // OK, implicitly converts to QString
    isValidIdentifier("QString");                // OK, ditto, unless QT_NO_CAST_FROM_ASCII is defined
    isValidIdentifier(QChar('x'));               // OK, ditto
As long as there was only the QString overload, these worked fine. Add a QStringView overload, and they all become ambiguous, because QStringView(QLatin1String) is just as good a conversion as QString(QLatin1String). That the QStringView constructor is deleted is only checked after overload resolution. But we never get there, because the overload is ambiguous.
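The situation can be reproduced with three toy types. Latin1, String and View are stand-ins for the Qt classes, and IsCallable is a hypothetical detector written for this sketch; it can only report that a call is ill-formed, not why.

```cpp
#include <type_traits>
#include <utility>

struct Latin1 {};                   // stand-in for QLatin1String

struct String {                     // stand-in for QString
    String(Latin1) {}
};

struct View {                       // stand-in for QStringView
    View(Latin1) = delete;          // deleted, but still a candidate
};

bool isValidIdentifier(const String &);   // pre-existing
bool isValidIdentifier(View);             // newly added

// Detects whether isValidIdentifier(T) is a well-formed call.
template <typename T, typename = void>
struct IsCallable : std::false_type {};
template <typename T>
struct IsCallable<T, decltype(void(isValidIdentifier(std::declval<T>())))>
    : std::true_type {};

// A String argument is an exact match for the first overload.
static_assert(IsCallable<String>::value, "");
// If deleted constructors were ignored during overload resolution, the
// String overload would be the only candidate and this call would be
// fine. It is not, so the deleted constructor still took part.
static_assert(!IsCallable<Latin1>::value, "");
```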
In both cases (the MSVC issue of allowing two user-defined conversions and the ambiguous-overload problem), the QStringView(QString) constructor is the root cause. Because it is a normal function, it is susceptible to implicit conversions.
Our problems would be solved if QStringView(QString) only accepted QStrings, as opposed to “everything convertible to QString”.
Rule of thumb: if you want to rule out implicit conversions, use a template function that takes its argument by reference.
Enable_If, The Second: Managing Conversions
So, let’s kill two birds with one stone by making the QStringView(QString) constructor a template:
    template <typename QStringLike, if_compatible_qstring_like<QStringLike> = true>
    QStringView(const QStringLike &str)
        : QStringView(str.isNull() ? nullptr : str.data(), qssize_t(str.size())) {}
where if_compatible_qstring_like matches only QString and QStringRef.
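Here is a minimal sketch of how the templated constructor resolves the earlier ambiguity. Latin1, String and View are toy stand-ins, and if_string_like is a simplified, hypothetical version of if_compatible_qstring_like that matches String only. The overloads return an int so that the selected one can be checked at compile time.

```cpp
#include <type_traits>

struct Latin1 {};                          // stand-in for QLatin1String

struct String {                            // stand-in for QString
    constexpr String() {}
    constexpr String(Latin1) {}            // implicit, like QString's
};

// Hypothetical constraint: satisfied by String itself and nothing else.
template <typename T>
using if_string_like =
    typename std::enable_if<std::is_same<T, String>::value, bool>::type;

struct View {                              // stand-in for QStringView
    // StringLike is deduced from the argument as-is, so types that are
    // merely convertible to String never reach this constructor.
    template <typename StringLike, if_string_like<StringLike> = true>
    constexpr View(const StringLike &) {}
};

constexpr int isValidIdentifier(const String &) { return 1; }
constexpr int isValidIdentifier(View) { return 2; }

// No longer ambiguous: only the String overload is viable for Latin1.
static_assert(isValidIdentifier(Latin1{}) == 1, "");
static_assert(isValidIdentifier(String{}) == 1, "");
static_assert(std::is_convertible<String, View>::value, "");
static_assert(!std::is_convertible<Latin1, View>::value, "");
```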
It is worthwhile to pause here and take a look at these two “overloads”:
    template <typename QStringLike ...> QStringView(const QStringLike &str);
    template <typename StdString ...> QStringView(const StdString &str);
Apart from the name of the template argument, the signatures are identical. The only reason why we can overload them at all is because their respective enable_if conditions are never both true for the same argument types. Such is the power of enable_if.
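A compact sketch of two constructors that share a parameter list and differ only in their constraints. QtLike and StdLike are empty stand-in types, and the source member exists purely to record which constructor ran.

```cpp
#include <type_traits>

struct QtLike {};    // stand-in for QString / QStringRef
struct StdLike {};   // stand-in for std::u16string

template <typename T>
using if_qt_like =
    typename std::enable_if<std::is_same<T, QtLike>::value, bool>::type;
template <typename T>
using if_std_like =
    typename std::enable_if<std::is_same<T, StdLike>::value, bool>::type;

struct View {
    // Same parameter list; only the (disjoint) constraints differ.
    template <typename S, if_qt_like<S> = true>
    constexpr View(const S &) : source(1) {}
    template <typename S, if_std_like<S> = true>
    constexpr View(const S &) : source(2) {}

    int source;      // which constructor ran (demonstration only)
};

static_assert(View(QtLike{}).source == 1, "");
static_assert(View(StdLike{}).source == 2, "");
// Neither condition holds for int, so no constructor is available.
static_assert(!std::is_constructible<View, int>::value, "");
```

This relies on the constraint being part of the template parameter list itself. With the earlier `typename = ...` form, the two declarations would differ only in a default template argument and would be rejected as a redeclaration.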
Let’s take this idea to the extreme now:
Detecting String Literals
Consider this call:
isValidIdentifier(u"QString");
We want the string-view construction to be done completely at compile time. In particular, we want the length of the string to be calculated at compile time, not at run time.
At our current point in QStringView development in this article, the call resolves to the following constructor, which reads, with full C++11 decoration and including its helper function:
    template <typename Char>
    #if __cplusplus >= 201402L // C++14
    constexpr
    #endif
    qssize_t lengthHelper(const Char *data)
    {
        qssize_t result = 0;
        while (*data++)
            ++result;
        return result;
    }

    template <typename Char, if_compatible_char<Char> = true>
    constexpr QStringView(const Char *str) noexcept
        : QStringView(str, str ? lengthHelper(str) : 0) {}
with Char deduced as char16_t.
So, with a C++14 compiler, lengthHelper() is constexpr and thus evaluated at compile-time, as we desired.
But in C++11 mode, lengthHelper() is not constexpr. Yes, the constructor is constexpr. But it calls a function which is not constexpr. How does that even compile?
Well, first of all, it’s not a function, but a function template. To a first approximation, you can simply slap a constexpr onto every function template. The compiler will silently drop it if for a given instantiation it would not be allowed.
Constexpr Magic
But we don’t even need that rule. For every Char this constructor is constexpr, even in C++11.
The reason this is so (and thankfully even MSVC implements it that way) is a nice (if you’re so inclined) special rule for constexpr functions. If there is even one argument that you could potentially pass to the function to make the body a constant expression, the whole function can be marked as constexpr.
And this is the case here. If str == nullptr, we invoke QStringView(nullptr, 0), which is a generalised constant expression. This is the only reason why we put the nullptr check into the constructor instead of lengthHelper().
Ok, so where are we?
We have understood why the constexpr keyword on the QStringView(const Char*) constructor is allowed (there are two reasons for this) and have seen that the expression QStringView(u"QString") is constexpr if and only if we compile in at least C++14 mode.
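The C++14 half of that statement can be checked with a small sketch. It assumes a compiler in C++14 mode or later, uses a hypothetical non-template View class in place of QStringView, and substitutes std::ptrdiff_t for qssize_t.

```cpp
#include <cstddef>

// Needs C++14 or later: a loop inside a constexpr function.
template <typename Char>
constexpr std::ptrdiff_t lengthHelper(const Char *data)
{
    std::ptrdiff_t result = 0;
    while (*data++)
        ++result;
    return result;
}

class View {
public:
    constexpr View(const char16_t *data, std::ptrdiff_t size)
        : m_data(data), m_size(size) {}

    // The null check lives here, not in lengthHelper(), so the
    // null-pointer path never calls the helper at all.
    constexpr View(const char16_t *str)
        : View(str, str ? lengthHelper(str) : 0) {}

    constexpr std::ptrdiff_t size() const { return m_size; }
    constexpr const char16_t *data() const { return m_data; }

private:
    const char16_t *m_data;
    std::ptrdiff_t m_size;
};

// Length computed by the compiler.
static_assert(View(u"QString").size() == 7, "");
// The null-pointer path is a constant expression as well.
static_assert(View(static_cast<const char16_t *>(nullptr)).size() == 0, "");
```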
Towards Constexpr in C++11
But we really really want the expression to be constexpr even in C++11.
Taking a cue from std::size() for arrays, we could get the idea to add an array-overload like this:
    template <typename Char, qssize_t N, if_compatible_char<Char> = true>
    constexpr QStringView(const Char (&array)[N])
        : QStringView(array, N - 1) {}
(the reference is needed, see your favourite C++ templates textbook for why; the -1 is to strip the trailing NUL). This is indeed constexpr even in C++11, and for all string literals.
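A self-contained sketch of the array constructor on its own, without the competing pointer constructor. View is a hypothetical stand-in restricted to char16_t, and the code sticks to what C++11 allows in constexpr functions.

```cpp
#include <cstddef>

class View {
public:
    constexpr View(const char16_t *data, std::ptrdiff_t size)
        : m_data(data), m_size(size) {}

    // N is part of the argument's type, so no scan of the string is needed.
    template <std::size_t N>
    constexpr View(const char16_t (&array)[N])
        : View(array, static_cast<std::ptrdiff_t>(N) - 1) {}

    constexpr std::ptrdiff_t size() const { return m_size; }
    constexpr const char16_t *data() const { return m_data; }

private:
    const char16_t *m_data;
    std::ptrdiff_t m_size;
};

static_assert(View(u"QString").size() == 7, "");
static_assert(View(u"").size() == 0, "");
// The whole array counts, embedded NULs included: 6 elements minus 1.
static_assert(View(u"ab\0cd").size() == 5, "");
```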
But it doesn’t overload with the QStringView(const Char *) constructor. Compiler diagnostics range from misleading to confusing, see GCC, Clang, MSVC @ godbolt.
But by now, we know what to do when we want to overload, but can’t, don’t we?
You guessed it: enable_if to the rescue.
Enable_If, The Third: Distinguish Between Arrays and Pointers
So we need to write enable_if conditions that distinguish between arrays and pointers. To get that information into enable_if in the first place, we again need to take a template argument by reference:
    template <typename Array, if_compatible_array<Array> = true>
    QStringView(const Array &array)
        : QStringView(array, std::size(array) - 1) {}

    template <typename Pointer, if_compatible_pointer<Pointer> = true>
    QStringView(const Pointer &ptr)
        : QStringView(ptr, ptr ? lengthHelper(ptr) : 0) {}
Once you have these, writing if_compatible_array and if_compatible_pointer and their helpers is rather straightforward (solution in the embedded links).
That’s it for today. Thank you for reading this far.
Lessons Learned
- You can overload any two types if you make the function a template, take by reference, and add enable_if with disjoint conditions.
- Hiding enable_if behind a type alias of the form if_condition<Args...> = true is by far the most readable way to constrain templates. It may be even nicer than Concepts Lite’s verbose requires clauses.
- QString has way too many implicit constructors.
You can follow QStringView development on this blog and on Gerrit.
Stay tuned!
Well, that was a really fascinating read! Thank you.
Nice read! Small syntax errors? “…har::value>::type*;” s/\*/>/
Thanks. No, it’s not an error. It’s making a pointer out of it (historic reasons). The angle brackets are balanced.
Here’s an alternative to avoid the appearance of the array as a pointer:
QStringView(const Char * const& data)
This would be attractive in code that has not yet gone the way of using enable_if. But since you already have, I might also have preferred your approach over the above. Just for the record.
Another thing that came to mind: You could overload the constructors of the QStringView with “const QChar&”, and take the address of that reference and write it into the QStringView pointer. This could work, but unfortunately it would be ambiguous when we overload together with QString. A quick-fix would be to make this constructor overload “explicit QStringView(const QChar& c)”. Then passing a QChar to a QString/QStringView overloaded function will pick the QString version.
As for the length determination of the array – I would *not* do that, because it has surprising behavior:
char x[255];
sprintf(x, "not %d chars", 254);
QStringView qv = x;
Every user would expect qv.size() to be much smaller than 254.