Qt 3D: One too many threads or what has changed in 5.14
Qt 3D makes heavy use of threads, both to spread work across CPU cores and maximize throughput, and to minimize the chances of blocking the main thread. Though nice on paper, the latter goal eventually leads to added complexity. Sometimes, there are just one too many threads.
In the past, we’ve been guilty of trying to do too much within Qt 3D rather than assuming that some things are the developer’s duty. For instance, there was a point in time where we’d compare the raw content of textures internally. The reason behind that was to handle cases where users would load the same texture several times rather than sharing one. This led to code that was hard to maintain and easy to break. Ultimately it provided convenience only for what can be seen as a misuse of Qt 3D, which was not the original intention.
We had similar systems in place for Geometries, Shaders… Part of the reason why we made such choices at the time was that the border between what Qt 3D should or shouldn’t be doing was really blurry. Over time we’ve realized that Qt 3D is lower level than what you’d do with QtQuick. A layer on top of Qt 3D would have been the right place to do such things instead. We’ve solved some of these pain points by starting work on Kuesa, which provides asset collections.
In the past couple of Qt releases we’ve been trying to simplify the code where it had become overly complex for debatable convenience. For Qt 5.14 we have decided to rework the threading architecture.
Before we dive into what has changed and what we gain from it, let’s first go over the architecture that was in place until 5.13.
Qt 3D Threading Architecture as of 5.13
The Main Thread
This is the Qt application thread where the Qt 3D Entity/Component scene tree lives.
The Aspect Thread
This is the thread in which Aspects live. Each Aspect is responsible for maintaining an internal backend tree that mirrors the frontend tree. A messaging mechanism keeps the frontend and backend trees in sync. Most of these messages contain a property name as a string and a value as a variant.
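To make the cost of this mechanism concrete, here is a minimal sketch of what a string-plus-variant change message could look like. It is not Qt 3D's actual code: `PropertyChange`, `BackendNode`, `distributeChanges` and the property names are hypothetical, and `std::variant` (C++17) stands in for QVariant.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <variant>
#include <vector>

// Hypothetical stand-in for a QVariant-style value.
using Value = std::variant<int, double, std::string>;

// One change message: target node, property name as a string, new value.
struct PropertyChange {
    int nodeId;
    std::string propertyName;
    Value value;
};

// Hypothetical backend mirror of a frontend node.
struct BackendNode {
    double scale = 1.0;
    std::string name;

    // The backend must dispatch on the property name (string comparisons)
    // and then unpack the variant (type checks) for every single change.
    bool applyChange(const PropertyChange &change) {
        if (change.propertyName == "scale") {
            if (const double *v = std::get_if<double>(&change.value)) {
                scale = *v;
                return true;
            }
        } else if (change.propertyName == "name") {
            if (const std::string *v = std::get_if<std::string>(&change.value)) {
                name = *v;
                return true;
            }
        }
        return false; // unknown property or unexpected value type
    }
};

// Delivers queued messages to the backend tree; returns how many were applied.
int distributeChanges(const std::vector<PropertyChange> &queue,
                      std::unordered_map<int, BackendNode> &backend) {
    int applied = 0;
    for (const PropertyChange &change : queue) {
        auto it = backend.find(change.nodeId);
        if (it != backend.end() && it->second.applyChange(change))
            ++applied;
    }
    return applied;
}
```

Every property update pays for a string allocation, a string comparison and a variant unpacking, which is the overhead the rest of this article comes back to.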
The ThreadPool
Each frame, based on the changes to and content of its backend tree, each aspect gets a chance to schedule work that needs to be performed:
- The Render aspect will for instance launch jobs that prepare the content for rendering, load buffers, load textures …
- The Input aspect will look for pending input events and send messages to the frontend tree to notify it about any state change
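The per-frame scheduling described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Aspect`, `jobsForFrame`, `runFrame` and `makeExampleAspects` are hypothetical names, jobs simply return a number, and `std::async` stands in for a real thread pool.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <utility>
#include <vector>

// A job is any callable; here it returns how many items it processed.
using Job = std::function<int()>;

// Hypothetical aspect: once per frame it is asked which jobs it wants to run.
struct Aspect {
    std::function<std::vector<Job>(int frame)> jobsForFrame;
};

// Runs one frame: collect the jobs of every aspect, run them concurrently,
// then wait for all of them. Returns the sum of what the jobs reported.
int runFrame(const std::vector<Aspect> &aspects, int frame) {
    std::vector<std::future<int>> pending;
    for (const Aspect &aspect : aspects) {
        std::vector<Job> jobs = aspect.jobsForFrame(frame);
        for (Job &job : jobs)
            pending.push_back(std::async(std::launch::async, std::move(job)));
    }
    int total = 0;
    for (std::future<int> &result : pending)
        total += result.get(); // blocks until that job has completed
    return total;
}

// Two example aspects loosely modelled on the bullets above.
std::vector<Aspect> makeExampleAspects() {
    Aspect render;
    render.jobsForFrame = [](int) {
        return std::vector<Job>{ [] { return 2; },   // e.g. load two buffers
                                 [] { return 1; } }; // e.g. load one texture
    };
    Aspect input;
    input.jobsForFrame = [](int frame) {
        // Pretend an input event is pending on odd frames only.
        return std::vector<Job>{ [frame] { return frame % 2; } };
    };
    return { render, input };
}
```

A real scheduler would also track dependencies between jobs, which this sketch leaves out.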
The RenderThread
Pure Qt 3D Case
This is the thread tasked with submitting the rendering commands that have been created previously by the render aspect’s jobs. The idea is that submitting these commands in a dedicated thread unblocks the Aspect and Main threads so that we can prepare content for frame n + 1 while rendering frame n.
Note, however, that this thread is only available when using “pure” Qt 3D, in other words when using Qt 3D without QtQuick and Scene3D.
Scene3D Case
When using Scene3D, we instead rely on the SceneGraph Thread (which can potentially be the same as the main thread) to ask for the rendering commands to be submitted.
Since the Scene Graph thread is outside of our control, when it asks for Qt 3D to submit commands it could be that the Render aspect jobs in charge of preparing said commands have yet to be completed. To handle that case, we would return early in the renderer and expect that when we’re called again in the future, jobs would finally have completed.
This means that potentially Qt Quick and Qt 3D would not be rendering at the same refresh rate. This can be seen as good or bad depending on your use case:
If the 3D content is secondary and not critical, then having Qt 3D not block Qt Quick can be desirable. If, on the other hand, you want Qt Quick and Qt 3D to be in sync, this was not possible until recently.
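The early-return behavior and its effect on refresh rates can be sketched as follows. This is a simplified model, not the actual renderer: `Renderer`, `markCommandsReady`, `render` and `simulate` are hypothetical names, and a single atomic flag stands in for "all render jobs have completed".

```cpp
#include <atomic>
#include <cassert>

// Hypothetical renderer state shared between the job threads
// and the scene graph thread.
struct Renderer {
    std::atomic<bool> commandsReady{false};
    int submittedFrames = 0; // only touched by the scene graph thread
    int skippedCalls = 0;    // only touched by the scene graph thread

    // Called by the last render job once the commands for a frame are built.
    void markCommandsReady() {
        commandsReady.store(true, std::memory_order_release);
    }

    // Called whenever the scene graph thread asks Qt 3D to render.
    bool render() {
        // Atomically check and reset the flag; if the jobs are not done,
        // return early and wait to be called again later.
        if (!commandsReady.exchange(false, std::memory_order_acq_rel)) {
            ++skippedCalls;
            return false;
        }
        ++submittedFrames; // stands in for the actual graphics submission
        return true;
    }
};

// Simulates `quickFrames` scene graph frames during which the render jobs
// only finish every `jobPeriod` frames. Returns the number of 3D frames
// that were actually submitted.
int simulate(int quickFrames, int jobPeriod) {
    Renderer renderer;
    for (int frame = 1; frame <= quickFrames; ++frame) {
        if (frame % jobPeriod == 0)
            renderer.markCommandsReady();
        renderer.render();
    }
    return renderer.submittedFrames;
}
```

With `simulate(6, 2)`, Qt Quick draws six frames while the 3D content is only submitted three times, which is the refresh rate mismatch described above.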
The Simulation Loop
This is the loop behind Qt 3D that drives aspects, rendering and synchronization:
- We wait for the Renderer to be ready for the next frame
- We synchronize the aspects’ backend trees by distributing change messages from the frontend tree stored in the ChangeArbiter
- Ask each aspect to launch jobs that need to be performed for the current frame
- One job from the RenderAspect will notify the RenderThread that all RenderCommands have been prepared
- We wait for all jobs to be completed
- We wait for the next frame
- Once RenderThread has all that’s required for rendering, it calls proceedToTheNextFrame and does the graphics submission
- This allows the next iteration of the simulation loop to start while we are still rendering.
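The handshake between the simulation loop and the RenderThread can be sketched with two threads and two semaphores. This is a heavily simplified model of the steps listed above, not the real implementation: `Semaphore` and `runSimulation` are hypothetical, a single integer stands in for the prepared render commands, and syncing and job execution are reduced to comments.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Small counting semaphore (std::counting_semaphore would need C++20).
class Semaphore {
public:
    explicit Semaphore(int initial) : m_count(initial) {}
    void release() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            ++m_count;
        }
        m_cv.notify_one();
    }
    void acquire() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this] { return m_count > 0; });
        --m_count;
    }
private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    int m_count;
};

// Runs `frames` iterations and returns the frame numbers in submission order.
std::vector<int> runSimulation(int frames) {
    Semaphore rendererReady(1);    // the renderer can accept the first frame
    Semaphore commandsPrepared(0); // nothing to render yet
    int preparedFrame = 0;         // hand-off slot guarded by the semaphores
    std::vector<int> submitted;    // only written by the render thread

    std::thread renderThread([&] {
        for (int i = 0; i < frames; ++i) {
            commandsPrepared.acquire();      // wait for the render jobs
            const int frame = preparedFrame; // take the prepared commands
            rendererReady.release();         // let the loop start frame n + 1
            submitted.push_back(frame);      // graphics submission of frame n
        }
    });

    std::thread aspectThread([&] {
        for (int frame = 1; frame <= frames; ++frame) {
            rendererReady.acquire(); // wait for the renderer to be ready
            // distribute change messages to the backend trees (omitted)
            // launch the aspect jobs (omitted); the last render job
            // publishes the commands and wakes up the render thread:
            preparedFrame = frame;
            commandsPrepared.release();
            // wait for remaining jobs and for the next frame (omitted)
        }
    });

    aspectThread.join();
    renderThread.join();
    return submitted;
}
```

Because the render thread releases `rendererReady` before it submits, frame n + 1 can be prepared while frame n is still being rendered.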
What has changed in 5.14?
As you can see, aspects live in their own thread. The main reason for this is to allow aspects to launch jobs and communicate while the main thread is blocked. Unfortunately, this forces us to use string-based messages to synchronize the trees. It also makes it quite difficult to synchronize the threads correctly in all cases.
What have we done for Qt 5.14? Simply put, we’ve removed the Aspect Thread.
Why?
Well, what’s the gain of having it? Even if the Aspect thread can spin freely while the main thread is blocked, do we really benefit from this behavior?
From experience, in 90% of cases, we use Qt 3D with Scene3D. This means that if the main thread were to be blocked, Qt Quick wouldn’t sync and Qt 3D wouldn’t render anything.
So arguably, we have a thread that’s making things more complex to handle a case that very few of us might benefit from. This benefit couldn’t offset the difficulties that resulted from it:
- Slow synching mechanism
- Very hard to control Qt 3D from the outside
- Lots of optimizations left on the table because it adds complexity to lots of areas (backend trees, string based messages, blocking threads to sync, hard to know what has changed…)
What do we gain from that?
This now means that Aspects and the simulation loop run in the main thread.
Gains from that are:
- We don’t need to use messages to sync; we can just compare against the frontend nodes directly:
  - This removes lots of allocations and string comparisons
  - QVariant comparison was a huge performance hit because of the way it handles multithreading
  - Technically we could now go as far as not having a copy of the tree in the aspects
  - This should make caching commands a lot easier.
- This allows greater control over the simulation loop:
  - We can now decide whether Qt 3D should be in the driver’s seat or not
  - Control can now be manual, done by the user
  - This makes integrating Qt 3D with 3rd party engines a lot easier
  - The Scene3D integration uses that approach:
    - Qt Quick and Qt 3D now run in sync at the same refresh rate
    - The downside is that if Qt 3D content is slow to render, Qt Quick will be impacted.
  - This should allow us to introduce more sync stages to avoid having to wait more than one frame to react to things like inputs, capture requests…
- Greatly simplifies startup and shutdown sequences.
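As a rough illustration of the first two gains, here is a minimal sketch of a simulation loop that runs on the caller's thread and syncs backend nodes by reading the frontend nodes directly. It is an assumption-laden model, not Qt 3D code: `FrontendNode`, `BackendNode`, `Engine` and their members are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical frontend node living in the main thread.
struct FrontendNode {
    double scale = 1.0;
    bool dirty = false;
    void setScale(double s) {
        if (s != scale) {
            scale = s;
            dirty = true; // remember that the backend needs updating
        }
    }
};

// Hypothetical backend counterpart owned by an aspect.
struct BackendNode {
    double scale = 1.0;
    // Same thread as the frontend, so the typed value is read directly:
    // no message object, no property name string, no variant.
    void syncFromFrontend(const FrontendNode &node) { scale = node.scale; }
};

class Engine {
public:
    explicit Engine(std::size_t nodeCount)
        : m_frontend(nodeCount), m_backend(nodeCount) {}

    FrontendNode &node(std::size_t i) { return m_frontend[i]; }
    const BackendNode &backendNode(std::size_t i) const { return m_backend[i]; }
    int frameCount() const { return m_frame; }

    // One iteration of the simulation loop, run on the caller's thread.
    // Returns how many backend nodes were synced for this frame.
    int processFrame() {
        int synced = 0;
        for (std::size_t i = 0; i < m_frontend.size(); ++i) {
            if (!m_frontend[i].dirty)
                continue;
            m_backend[i].syncFromFrontend(m_frontend[i]);
            m_frontend[i].dirty = false;
            ++synced;
        }
        // launching jobs and submitting rendering would follow here
        ++m_frame;
        return synced;
    }

private:
    std::vector<FrontendNode> m_frontend;
    std::vector<BackendNode> m_backend;
    int m_frame = 0;
};
```

Since the caller decides when `processFrame()` runs, an outside driver such as a Qt Quick item or a third-party engine can step the 3D simulation exactly once per frame of its own loop.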
What’s next?
In a follow-up article, we will see how Qt 5.14 also changed the way the scene state gets synced between frontend and backend nodes, providing significant performance gains on very dynamic scenes.