Debugging and Profiling Qt 3D Applications
Learn about new 5.15 features and other useful tools
Qt 3D is a retained-mode, high-level abstraction over graphics APIs, and it tries to hide most of the details involved in rendering the data provided by applications. It makes many decisions and performs many operations in the background in order to get pixels on the screen. But because Qt 3D also has a very rich API, developers can exert a lot of control over rendering by manipulating the scene graph and, more importantly, the frame graph. It is, however, sometimes difficult to understand how various operations affect performance.
In this article, we look at some of the tools, both old and new, that can be used to investigate what Qt 3D is doing in the back end and get some insight into what is going on during the frame.
Built-in Profiling
The first step in handling performance issues is, of course, measuring where time is spent. This can be as simple as measuring how long it took to render a given frame. But to make sense of these numbers, it helps to have a notion of how complex the scene is.
In order to provide measurable information, Qt 3D introduces a visual overlay that will render details of the scene, constantly updated in real time.
The overlay shows some real time data:
- Time to render the last frame and FPS (frames per second), averaged and plotted over the last few seconds. Because Qt 3D locks to VSync by default, this should not exceed 60 fps on most configurations.
- Number of Jobs: these are the tasks that Qt 3D executes on every frame. The number of jobs may vary depending on changes in the scene graph, whether animations are active, etc.
- Number of Render Views: this loosely corresponds to the number of render passes; see the discussion of the frame graph below.
- Number of Commands: this is the total number of draw calls (and compute calls) in the frame.
- Number of Vertices and Primitives (triangles, lines and points combined).
- Number of Entities, Geometries and Textures in the scene graph. For the last two, the overlay will also show the number of geometries and textures that are effectively in use in the frame.
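To make the first item in the list concrete, here is a minimal sketch of how an averaged frame time and FPS figure can be derived from individual frame durations. It is not Qt 3D's implementation: the `FrameStats` class and its methods are hypothetical names, and the averaging window is a fixed number of frames rather than a number of seconds.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <numeric>

// Hypothetical helper: keeps the most recent frame times (in milliseconds)
// and reports their mean and the equivalent frames-per-second figure.
class FrameStats
{
public:
    explicit FrameStats(std::size_t capacity) : m_capacity(capacity)
    {
        assert(capacity > 0);
    }

    // Record one frame; the oldest sample is dropped once the window is full.
    void addFrame(double milliseconds)
    {
        m_samples.push_back(milliseconds);
        if (m_samples.size() > m_capacity)
            m_samples.pop_front();
    }

    // Mean frame time over the window, or 0 if no frame was recorded yet.
    double averageFrameTimeMs() const
    {
        if (m_samples.empty())
            return 0.0;
        const double sum = std::accumulate(m_samples.begin(), m_samples.end(), 0.0);
        return sum / static_cast<double>(m_samples.size());
    }

    // FPS equivalent of the mean frame time, or 0 if there is no data.
    double averageFps() const
    {
        const double ms = averageFrameTimeMs();
        return ms > 0.0 ? 1000.0 / ms : 0.0;
    }

private:
    std::size_t m_capacity;
    std::deque<double> m_samples;
};
```

With a steady 16 ms per frame, `averageFps()` reports 62.5. Averaging frame times and then converting to FPS, rather than averaging per-frame FPS values, keeps a single slow frame from being under-weighted.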
As seen in the screenshots above, the scene graph contains two entities, each with one geometry. This produces two draw calls when both objects are in frame. But as the sphere rotates out of the screen, you can see the effect of the view frustum culling job, which makes sure the sphere doesn’t get rendered, leaving a single draw call for the torus.
This overlay can be enabled by setting the showDebugOverlay property of the QForwardRenderer to true.
Understanding Rendering Steps
To make sense of the numbers above, it helps to understand the details of the scene graph and frame graph.
In the simple case, as in the screen shots, an entity will have a geometry (and material, maybe a transform). But many entities may share the same geometry (a good thing if appropriate!). Also, entities may not have any geometry but just be used for grouping and positioning purposes.
So keeping an eye on the number of entities and geometries, and seeing how that affects the number of commands (or draw calls), is valuable. If you find one geometry drawn one thousand times in a thousand separate entities, it may be a good indication that you should refactor your scene to use instanced rendering.
In order to provide more details, the overlay has a number of buttons that can be used to dump the current state of the rendering data.
For a deeper understanding of this, you might consider our full Qt 3D Training course.
Scene Graph
Dumping the scene graph will print data to the console, like this:
Qt3DCore::Quick::Quick3DEntity{1} [ Qt3DRender::QRenderSettings{2}, Qt3DInput::QInputSettings{12} ]
Qt3DRender::QCamera{13} [ Qt3DRender::QCameraLens{14}, Qt3DCore::QTransform{15} ]
Qt3DExtras::QOrbitCameraController{16} [ Qt3DLogic::QFrameAction{47}, Qt3DInput::QLogicalDevice{46} ]
Qt3DCore::Quick::Quick3DEntity{75} [ Qt3DExtras::QTorusMesh{65}, Qt3DExtras::QPhongMaterial{48},
Qt3DCore::QTransform{74} ]
Qt3DCore::Quick::Quick3DEntity{86} [ Qt3DExtras::QSphereMesh{76}, Qt3DExtras::QPhongMaterial{48},
Qt3DCore::QTransform_QML_0{85} ]
This prints the hierarchy of entities and for each of them lists all the components. The id (in curly brackets) can be used to identify shared components.
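Spotting shared components by eye gets tedious on larger scenes. The following is a minimal sketch that scans dump text like the one above and reports the component ids referenced by more than one entity. It assumes each entity, with its bracketed component list, has been joined onto a single line; the function name `sharedComponentIds` is hypothetical and not part of Qt 3D.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: for every component id that is referenced in the
// bracketed component lists more than once, returns how often it is referenced.
// Ids before the '[' (the entity's own id) are ignored.
std::map<int, int> sharedComponentIds(const std::vector<std::string> &entityLines)
{
    static const std::regex idPattern(R"(\{(\d+)\})");
    std::map<int, int> useCount;

    for (const std::string &line : entityLines) {
        const std::size_t bracket = line.find('[');
        if (bracket == std::string::npos)
            continue; // entity without a component list

        auto it = std::sregex_iterator(line.begin() + static_cast<std::ptrdiff_t>(bracket),
                                       line.end(), idPattern);
        for (; it != std::sregex_iterator(); ++it)
            ++useCount[std::stoi((*it)[1].str())];
    }

    // Keep only the ids that are referenced at least twice.
    for (auto it = useCount.begin(); it != useCount.end();) {
        if (it->second < 2)
            it = useCount.erase(it);
        else
            ++it;
    }
    return useCount;
}
```

Applied to the dump above, the only id reported is 48, the QPhongMaterial that the torus and sphere entities both reference.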
Frame Graph
Similar data can be dumped to the console to show the active frame graph:
Qt3DExtras::QForwardRenderer
Qt3DRender::QRenderSurfaceSelector
Qt3DRender::QViewport
Qt3DRender::QCameraSelector
Qt3DRender::QClearBuffers
Qt3DRender::QFrustumCulling
Qt3DRender::QDebugOverlay
This is the default forward renderer frame graph that comes with Qt 3D Extras.
As you can see, one of the nodes in that graph is of type QDebugOverlay. If you build your own frame graph, you can use an instance of that node to control which surface the overlay will be rendered onto. Only one branch of the frame graph may contain a debug node. If the node is enabled, then the overlay will be rendered for that branch.
The frame graph above is one of the simplest you can build. They may get more complicated as you build effects into your rendering. Here’s an example of a Kuesa frame graph:
Kuesa::PostFXListExtension
Qt3DRender::QViewport
Qt3DRender::QClearBuffers
Qt3DRender::QNoDraw
Qt3DRender::QFrameGraphNode (KuesaMainScene)
Qt3DRender::QLayerFilter
Qt3DRender::QRenderTargetSelector
Qt3DRender::QClearBuffers
Qt3DRender::QNoDraw
Qt3DRender::QCameraSelector
Qt3DRender::QFrustumCulling
Qt3DRender::QTechniqueFilter
Kuesa::OpaqueRenderStage (KuesaOpaqueRenderStage)
Qt3DRender::QRenderStateSet
Qt3DRender::QSortPolicy
Qt3DRender::QTechniqueFilter
Kuesa::OpaqueRenderStage (KuesaOpaqueRenderStage)
Qt3DRender::QRenderStateSet
Qt3DRender::QSortPolicy
Qt3DRender::QFrustumCulling
Qt3DRender::QTechniqueFilter
Kuesa::TransparentRenderStage (KuesaTransparentRenderStage)
Qt3DRender::QRenderStateSet
Qt3DRender::QSortPolicy
Qt3DRender::QTechniqueFilter
Kuesa::TransparentRenderStage (KuesaTransparentRenderStage)
Qt3DRender::QRenderStateSet
Qt3DRender::QSortPolicy
Qt3DRender::QBlitFramebuffer
Qt3DRender::QNoDraw
Qt3DRender::QFrameGraphNode (KuesaPostProcessingEffects)
Qt3DRender::QDebugOverlay
Qt3DRender::QRenderStateSet (ToneMappingAndGammaCorrectionEffect)
Qt3DRender::QLayerFilter
Qt3DRender::QRenderPassFilter
If you are not familiar with the frame graph, it is important to understand that each path (from root to leaf) represents a render pass. So the simple forward renderer represents a single render pass, but the Kuesa frame graph above contains eight passes!
It is therefore often easier to look at the frame graph in terms of those paths. This can also be dumped to the console:
[ Kuesa::PostFXListExtension, Qt3DRender::QViewport, Qt3DRender::QClearBuffers, Qt3DRender::QNoDraw ]
[ Kuesa::PostFXListExtension, Qt3DRender::QViewport, Qt3DRender::QFrameGraphNode (KuesaMainScene),
Qt3DRender::QLayerFilter, Qt3DRender::QRenderTargetSelector, Qt3DRender::QClearBuffers, Qt3DRender::QNoDraw ]
[ Kuesa::PostFXListExtension, Qt3DRender::QViewport, Qt3DRender::QFrameGraphNode (KuesaMainScene),
Qt3DRender::QLayerFilter, Qt3DRender::QRenderTargetSelector, Qt3DRender::QCameraSelector, Qt3DRender::QFrustumCulling,
Qt3DRender::QTechniqueFilter, Kuesa::OpaqueRenderStage (KuesaOpaqueRenderStage), Qt3DRender::QRenderStateSet,
Qt3DRender::QSortPolicy ]
[ Kuesa::PostFXListExtension, Qt3DRender::QViewport, Qt3DRender::QFrameGraphNode (KuesaMainScene),
Qt3DRender::QLayerFilter, Qt3DRender::QRenderTargetSelector, Qt3DRender::QCameraSelector, Qt3DRender::QTechniqueFilter,
Kuesa::OpaqueRenderStage (KuesaOpaqueRenderStage), Qt3DRender::QRenderStateSet, Qt3DRender::QSortPolicy ]
[ Kuesa::PostFXListExtension, Qt3DRender::QViewport, Qt3DRender::QFrameGraphNode (KuesaMainScene),
Qt3DRender::QLayerFilter, Qt3DRender::QRenderTargetSelector, Qt3DRender::QCameraSelector, Qt3DRender::QFrustumCulling,
Qt3DRender::QTechniqueFilter, Kuesa::TransparentRenderStage (KuesaTransparentRenderStage), Qt3DRender::QRenderStateSet,
Qt3DRender::QSortPolicy ]
[ Kuesa::PostFXListExtension, Qt3DRender::QViewport, Qt3DRender::QFrameGraphNode (KuesaMainScene),
Qt3DRender::QLayerFilter, Qt3DRender::QRenderTargetSelector, Qt3DRender::QCameraSelector, Qt3DRender::QTechniqueFilter,
Kuesa::TransparentRenderStage (KuesaTransparentRenderStage), Qt3DRender::QRenderStateSet, Qt3DRender::QSortPolicy ]
[ Kuesa::PostFXListExtension, Qt3DRender::QViewport, Qt3DRender::QFrameGraphNode (KuesaMainScene),
Qt3DRender::QLayerFilter, Qt3DRender::QRenderTargetSelector, Qt3DRender::QBlitFramebuffer, Qt3DRender::QNoDraw ]
Hopefully this is a good way of tracking down issues you may have when building a custom frame graph.
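The relationship between the tree view and the path view can be sketched in a few lines. The code below is an illustration only, using a hypothetical `FrameGraphNode` struct rather than Qt 3D's own classes: it walks a tree depth first and records one path per leaf, so the number of paths is the number of render passes in the sense described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a frame graph node: a name and child nodes.
struct FrameGraphNode {
    std::string name;
    std::vector<FrameGraphNode> children;
};

using Path = std::vector<std::string>;

// Depth-first traversal; every leaf closes one root-to-leaf path.
void collectPaths(const FrameGraphNode &node, Path &current, std::vector<Path> &out)
{
    current.push_back(node.name);
    if (node.children.empty()) {
        out.push_back(current);
    } else {
        for (const FrameGraphNode &child : node.children)
            collectPaths(child, current, out);
    }
    current.pop_back(); // backtrack before visiting siblings
}

// Returns all root-to-leaf paths of the tree, in traversal order.
std::vector<Path> renderPaths(const FrameGraphNode &root)
{
    std::vector<Path> paths;
    Path current;
    collectPaths(root, current, paths);
    return paths;
}
```

A graph that is a single chain of nodes, like the default forward renderer, yields exactly one path. Each extra leaf adds another path, which is why adding a branch for an effect adds a pass.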
Draw Commands
On every pass of the frame graph, Qt 3D will traverse the scene graph, find entities that need to be rendered, and for each of them, issue a draw call. The number of objects drawn in each pass may vary, depending on whether the entities and all of their components are enabled or not, or whether entities get filtered out by using QLayers (different passes may draw different portions of the scene graph).
The new profiling overlay also gives you access to the actual draw calls.
So in this simple example, you can see that two draw calls are made, both for indexed triangles. You can also see some details about the render target, such as the viewport, the surface size, etc.
That information can also be dumped to the console, which makes it easier to search in a text editor.
Built-in Job Tracing
The data above provides a useful real time view on what is actually being processed to render a particular frame. However, it doesn’t provide much feedback as to how long certain operations take and how that changes during the runtime of the application.
In order to track such information, you need to enable tracing.
Tracing tracks, for each frame, what jobs are executed by Qt 3D’s backend. Jobs involve updating global transformations and the bounding volume hierarchy, finding objects in the view frustum, layer filtering, picking, input handling, animating, etc. Some jobs run every frame, some only run when internal state needs updating.
If your application is slow, it may be because jobs are taking a lot of time to complete. But how do you find out which jobs take up all the time?
Qt 3D has had tracing built in for a few years, but it was hard to get to. You needed to do your own build of Qt 3D and enable tracing when running qmake. From then on, every single run of an application linked against that build of Qt 3D would generate a trace file.
In 5.15, tracing is always available. It can be enabled in two ways:
- By setting the QT3D_TRACE_ENABLED environment variable before the application starts (or at least before the aspect engine is created). This means the tracing will happen for the entire run of the application.
- If you’re interested in tracing a specific part of your application’s lifetime, you can enable the overlay and toggle tracing on and off using the Jobs checkbox. In this case, a new trace file will be generated every time tracing is enabled.
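The environment variable from the first option does not have to come from a shell. Here is a minimal sketch that sets it from inside the program, assuming a POSIX or Windows C runtime. The helper name `enableQt3DTracing` is hypothetical, and the value "1" is an assumption, since all that is stated above is that the variable must be set. In a Qt application, `qputenv` serves the same purpose.

```cpp
#include <cassert>
#include <stdlib.h>

// Hypothetical helper: sets QT3D_TRACE_ENABLED for the current process.
// Returns true if the variable could be set.
// The value "1" is an assumption; only the presence of the variable is
// documented in the text above.
bool enableQt3DTracing()
{
#ifdef _WIN32
    return _putenv_s("QT3D_TRACE_ENABLED", "1") == 0;
#else
    return setenv("QT3D_TRACE_ENABLED", "1", /*overwrite=*/1) == 0;
#endif
}
```

Call this at the very start of `main`, before the aspect engine is created, so that tracing covers the entire run.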
For every tracing session, Qt 3D will generate one file in the current working directory. So how do you inspect the content of that file?
KDAB provides a visualisation tool, but it is not currently shipped with Qt 3D. You can get the source from GitHub and build it yourself. Because jobs change from one version of Qt 3D to the next, you need to take care to configure which version was used to generate the trace files. Using that tool, you can open the trace files, and it will render a timeline of all the jobs that were executed for every frame.
In the example above, you can see roughly two frames’ worth of data, with jobs executed on a thread pool. You can see the longer running jobs, in this case:
- RenderViewBuilder jobs, which create all the render views, one for each branch in the frame graph. You can see some of them take much longer than others.
- FrameSubmissionPart1 and FrameSubmissionPart2 which contain the actual draw calls.
Of course, you need to spend some time understanding what Qt 3D is doing internally to make sense of that data. As with most performance monitoring tools, it’s worth spending the time experimenting with this and seeing what gets affected by changes you make to your scene graph or frame graph.
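When comparing runs before and after a change, it can help to reduce a trace to one number per job type. The sketch below shows that aggregation only. The trace file format is not described here, so the `JobRecord` struct is a hypothetical in-memory representation (job name plus start and end time), not the actual layout of Qt 3D's trace files.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one executed job; times are in milliseconds.
struct JobRecord {
    std::string name;
    double startMs;
    double endMs;
};

using JobTotal = std::pair<std::string, double>;

// Sums the duration of all records per job name and returns the totals
// sorted from most to least expensive.
std::vector<JobTotal> totalTimePerJob(const std::vector<JobRecord> &records)
{
    std::map<std::string, double> totals;
    for (const JobRecord &record : records) {
        assert(record.endMs >= record.startMs);
        totals[record.name] += record.endMs - record.startMs;
    }

    std::vector<JobTotal> sorted(totals.begin(), totals.end());
    std::stable_sort(sorted.begin(), sorted.end(),
                     [](const JobTotal &a, const JobTotal &b) { return a.second > b.second; });
    return sorted;
}
```

Because jobs run on a thread pool, these totals are summed CPU time per job type, not wall-clock time per frame. The timeline view remains the better tool for seeing what sits on the critical path.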
Job Dependencies
Another important source of information when analysing performance of jobs is looking at the dependencies. This is mostly useful for developers of Qt 3D aspects.
Using the profiling overlay, you can now dump the dependency graph in GraphViz dot format.
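For readers who have not used the dot format, the following sketch shows what a dependency graph looks like in it. It illustrates the format only and does not reproduce Qt 3D's actual output: the function `toDot` is hypothetical, and the edge direction (prerequisite first, dependent job second) is a choice made for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Each pair is (prerequisite job, job that depends on it).
using Dependency = std::pair<std::string, std::string>;

// Hypothetical helper: writes the dependencies as a GraphViz directed graph.
std::string toDot(const std::vector<Dependency> &dependencies)
{
    std::ostringstream dot;
    dot << "digraph jobs {\n";
    for (const Dependency &edge : dependencies) {
        // Names are emitted unescaped, so they must not contain quotes.
        assert(edge.first.find('"') == std::string::npos);
        assert(edge.second.find('"') == std::string::npos);
        dot << "    \"" << edge.first << "\" -> \"" << edge.second << "\";\n";
    }
    dot << "}\n";
    return dot.str();
}
```

Text in this format, saved to a file such as `jobs.dot` (an example name), can be rendered with GraphViz, for instance `dot -Tpng jobs.dot -o jobs.png`.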
Other Tools
Static capabilities
Qt 3D 5.15 introduces QRenderCapabilities, which can be used to make runtime decisions based on the actual capabilities of the hardware the application is running on. The class supports a number of properties which report information such as the graphics API in use, the card vendor, and the supported versions of OpenGL and GLSL. It also has information related to the maximum number of samples for MSAA, the maximum texture size, whether UBOs and SSBOs are supported and what their maximum size is, etc.
Third Party Tools
Of course, using more generic performance tools is also a good idea.
perf can be used for general tracing, giving you insight into where time is spent, both for Qt 3D and for the rest of your application. Use it in combination with KDAB’s very own hotspot to get powerful visualisation of the critical paths in the code.
Using the flame graph, as shown above (captured on an embedded board), you can usually spot the two main sections of Qt 3D work: the job processing and the actual rendering.
Other useful tools are the OpenGL trace capture applications, either the generic ones such as apitrace and RenderDoc, or the ones provided by your hardware manufacturer, such as NVIDIA or AMD.
Conclusion
We hope this article will help you get more performance out of your Qt 3D applications. The tools, old and new, should be very valuable to help find bottlenecks and see the impact of changes you make to your scene graph or frame graph. Furthermore, improvements regarding performance are in the works for Qt 6, so watch this space!
Hi, how can I enable the environment variable QT3D_TRACE_ENABLED in my project?
It’s just an environment variable. Set it on the command line, set it in Creator’s run time configuration for your project, or set it in code using qputenv before creating the first window.
Hi @Mike Krus,
I’m developing a 3D application on QNX device using Qt 5.12. As I read your post, all your solutions are available from Qt5.15. Is there any tool that supports profiling my app which is being developed using Qt 5.12?
Thank you!
Depends what you’re looking for. For GL calls you can use apitrace, the platform/vendor tools, etc. For tracking what Qt3D is doing you can build it to enable job traces (there’s a configure option for that) and use the same job tracker. But that doesn’t have all the information that the debug overlay provides.
I would recommend upgrading to 5.15 anyway, its performance is much better.
It seems I am the stupid one here.
How do I set QT3D_TRACE_ENABLED?
I tried it in different ways and got no output.
When you build Qt 3D, you need to pass extra options to qmake. Something like: "qmake <normal_options> -- --qt3d-profile-jobs=yes"
(don’t have a 5.12 build of qt available to test it out)