Qt and Trivial Relocation (Part 4) On trivial relocation and move assignments
In the last post of this series we learned that:
- erasing elements from the middle of a vector can be implemented, in general, via a series of move assignments, move constructions, swaps, destructions
- for types with value semantics, the exact strategy does not really matter
- for types with write-through reference semantics, the strategy matters, because the outcome is different
- for this reason, the Standard Library specifies exactly which strategy is employed: move assignments and destructions
- Qt optimizes erasure of trivially relocatable types by using memmove to compact the tail.
We ended up realizing that this creates a contradiction: a type with write-through reference semantics like std::tuple<int &> is trivially relocatable (move construction and destruction of the source can be replaced by moving bytes), but it cannot be erased via byte manipulations. However, if we were to mark that type as trivial relocatable (via Q_DECLARE_TYPEINFO) then Qt would use byte manipulations, breaking our programs!
The conclusion of the last post was that we need to change something in our models:
- maybe std::vector should use a different strategy when erasing elements
- maybe types like std::tuple<int &> should not be allowed to be stored in a vector
- maybe Qt should not be using memmove when erasing objects of trivially relocatable type (but it can still optimize the reallocation of a vector)
- maybe Qt’s definition of trivial relocability does not match ours, and we need to fix our definitions.
In this post we will explore these possibilities and reach some conclusions.
Changing the behavior of vector erasure
As we have already discussed in the previous blog posts, it is possible to implement erasure in a number of ways, which are not equivalent. The Standard Library chose a specific implementation strategy (move-assign the elements after the ones to be destroyed to the left; destroy the moved-from last elements). Changing it now, over 26 years after the fact, sounds extremely scary; std::vector is such a central class that surely such a change would break somebody’s code.
That doesn’t mean that we can’t at least reason about a possible change there!
Also, there is another library that we keep talking about in these blog posts. This other library has a much smaller userbase than the Standard Library, and that has fewer regards with breaking backwards compatibility. We should certainly reason about that library as well. I’m talking about Qt, of course 🙂
Change the behavior of vector erasure what, exactly?
In order to reconcile our definition of (trivial) relocatability with the behavior of erase, we would need to implement erase in terms of move constructions and destructions.
Thankfully, this can be done. Suppose again we want to erase the second element from a vector:
We can implement this by destroying the second element, and move-constructing the third element over it:
Then we destroy the moved-from third element, and move-construct the fourth element over it.
We keep going until the end, where we have just a moved-from element to destroy:
Erasure can use relocation
The fact that this particular implementation strategy is viable gives us some very interesting insights.
Let’s look at the sequence of operations once again:
- Destroy the element at position 1 (B)
- Move construct the element at position 2 into position 1 (C)
- Destroy the element at position 2 (moved-from C)
- Move construct the element at position 3 into position 2 (D)
- Destroy the element at position 3 (moved-from D)
We can look at this sequence in two different ways:
- firstly, as pairs of “destroy + move construct” operations, leaving a last element to destroy (1+2, 3+4, 5)
- or, more interestingly, as an initial destruction, followed by a series of relocations. (1, 2+3, 4+5).
Move assignments for value types
Let’s analyze the first way: pairs of “destruction + move construction” operations. This analysis will allow us to grasp the fundamental difference between value types (like QString, or std::unique_ptr<T>) from reference types (like std::tuple<int &>).
We have already established that for a value type it does not matter what the erasure strategy is; the outcome is the same. For a write-through reference type, the strategy matters; using move assignments instead of destructions/move constructions pairs will yield a different result.
We can now appreciate the difference: for a value type, a move assignment is equivalent to destruction of the target, followed by move construction. This is why, when dealing with value types, using the above strategy for erasure gives us the same results as a strategy based on move assignments (like the one that std::vector uses).
For a write-through reference type, this is not true: a move assignment is not equivalent to destroying the target and move-constructing the source in its place. Again, consider std::tuple<int &>:
int a = 1; int b = 2; using T = std::tuple<int &>; T ta(a); T tb(b); #if USE_MOVE_ASSIGN // move-assignment from tb ta = std::move(tb); // a = 2, b = 2 #else // destroy ta + move-construct from tb ta.~T(); new (&ta) T(std::move(tb)); // a = 1, b = 2 #endif
It is worth noticing the behavior of move assigments is completely orthogonal to whether a type is trivially relocatable or not. Both QString and std::tuple<int &>: are trivially relocatable (according to our prior definition), but QString has value semantics, std::tuple<int &>: has not. In other words, this is another property of a type which cannot be “detected” from the outside the type itself (just like trivial relocation cannot be detected).
Trivial relocation and erasure
Let’s now tackle the second way of looking at erasure: destruction of an element, followed by a series of relocations.
This means that if a type is trivially relocatable, then the erasure strategy shown above can be implemented by an initial destruction followed by a memory copy for each subsequent element. Of course, these byte copies can be coalesced in a single memmove of the tail:
This is precisely what Qt does when erasing elements of a trivially relocatable type! At least now we have a justification for it.
So which erasure strategy does QVector use exactly?
If Qt uses trivial relocation to erase elements from a QVector of trivially relocatable types, does this mean that QVector also uses (non-trivial) relocation when erasing objects of other types?
Well… no, not really. Qt is way, way, way less strict than the C++ Standard Library: the exact strategies for erasure, insertion etc. are simply undocumented and can and will change at any time. (In fact, as I’m writing this blog post, I can see commits that change these algorithms…) The strategies are also inconsistent between different containers.
For instance, consider a non-empty QVector of size S. If I erase the first element, exactly how many operations are going to happen? 𝒪(S), right?
That would be true in Qt 5, but it’s not true any more in Qt 6! In Qt 6 the operation is 𝒪(1): QVector in Qt 6 has a “prepend optimization”: there may be empty capacity at both ends of the storage. So, in order to erase the first element, it’s sufficient to destroy it and update bookkeeping (move the “begin” pointer forward). This is just one amongst many other similar examples where Qt containers have changed behavior between Qt releases; other changes do happen even between minor releases.
The corollary is of this lack of guarantees is quite strong: do not ever store types that don’t have value semantics in Qt containers! Qt will break your programs when its containers change behavior. It’s a when, not an if.
Thankfully, the majority of types we deal with in Qt applications have value semantics, so they are not affected.
Could we just ban non-value types from containers?
There is one last question left. Who in the world uses std::tuple<int &> as a vector’s value type?
While that may sound very unlikely, in reality there are many other types for which assignment is different than destruction+move construction.
For instance: allocator-aware types (std::string, std::vector, etc.) may exhibit non-value behavior with certain allocators. An example of such an allocator is std::pmr::polymorphic_allocator, which does not propagate on move assignment. Erasing a std::pmr::string from a vector must have a very specific outcome, dictated by its reference semantics; but surely we can’t outlaw vector of strings!
Even if we could decide to ban “some” types, how do we figure out their assignment semantics?. A Rule Of Five type may or may not have value semantics, and there is no way to know about that “from the outside”.
So, this line of reasoning leads to nowhere.
Conclusions
In this post we started with a list of questions. In search for answers, we have have seen that:
- for some types (“value” types), move assignment is equivalent to destruction of the target followed by move construction from the source, while for other types this is not true
- we can’t just ban the latter types from being stored in containers. They’re useful, and in general we have no means to detect them!
- vector erasure can be implemented in terms of relocations: this fact potentially allows a vector implementation (like QVector) to use trivial relocation when erasing elements
- Qt containers assume value semantics from the objects they hold, and do not document the exact strategies used by their algorithms
- we can’t change std::vector erasure behavior, because it would break existing code.
In the next post we will finish our exploration of trivial relocability, and we’ll look at C++ standardization. Thank you for reading!
Overview about all installments:
If you like this article and want to read similar material, consider subscribing via our RSS feed.
Subscribe to KDAB TV for similar informative short video content.
KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.