Trusted Software Excellence across Desktop and Embedded
Take a glance at the areas of expertise where KDAB excels ranging from swift troubleshooting, ongoing consulting and training to multi-year, large-scale software development projects.
Find out why customers from innovative industries rely on our extensive expertise, including Medical, Biotech, Science, Renewable Energy, Transportation, Mobility, Aviation, Automation, Electronics, Agriculture and Defense.
High-quality Embedded Engineering across the Stack
To successfully develop an embedded device that meets your expectations regarding quality, budget and time to market, all parts of the project need to fit perfectly together.
Learn more about KDAB's expertise in embedded software development.
Where the capabilities of modern mobile devices or web browsers fall short, KDAB engineers help you expertly architect and build high-functioning desktop and workstation applications.
Extensible, Safety-compliant Software for the Medical Sector
Create intelligent, patient-focused medical software and devices and stay ahead with technology that adapts to your needs.
KDAB offers you expertise in developing a broad spectrum of clinical and home-healthcare devices, including but not limited to, internal imaging systems, robotic surgery devices, ventilators and non-invasive monitoring systems.
Building digital dashboards and cockpits with fluid animations and gesture-controlled touchscreens is a big challenge.
In over two decades of developing intricate UI solutions for cars, trucks, tractors, scooters, ships, airplanes and more, the KDAB team has gained market leading expertise in this realm.
Build on Advanced Expertise when creating Modern UIs
KDAB assists you in the creation of user-friendly interfaces designed specifically for industrial process control, manufacturing, and fabrication.
Our specialties encompass the custom design and development of HMIs, enabling product accessibility from embedded systems, remote desktops, and mobile devices on the move.
Legacy software is a growing but often ignored problem across all industries. KDAB helps you elevate your aging code base to meet the dynamic needs of the future.
Whether you want to migrate from an old to a modern GUI toolkit, update to a more recent version, or modernize your code base, you can rely on over 25 years of modernization experience.
KDAB offers a wide range of services to address your software needs including consulting, development, workshops and training tailored to your requirements.
Our expertise spans cross-platform desktop, embedded and 3D application development, using the proven technologies for the job.
When working with KDAB, the first-ever Qt consultancy, you benefit from a deep understanding of Qt internals, that allows us to provide effective solutions, irrespective of the depth or scale of your Qt project.
Qt Services include developing applications, building runtimes, mixing native and web technologies, solving performance issues, and porting problems.
KDAB helps create commercial, scientific or industrial desktop applications from scratch, or update its code or framework to benefit from modern features.
Discover clean, efficient solutions that precisely meet your requirements.
Boost your team's programming skills with in-depth, constantly updated, hands-on training courses delivered by active software engineers who love to teach and share their knowledge.
Our courses cover Modern C++, Qt/QML, Rust, 3D programming, Debugging, Profiling and more.
The collective expertise of KDAB's engineering team is at your disposal to help you choose the software stack for your project or master domain-specific challenges.
Our particular focus is on software technologies you use for cross-platform applications or for embedded devices.
Since 1999, KDAB has been the largest independent Qt consultancy worldwide and today is a Qt Platinum partner. Our experts can help you with any aspect of software development with Qt and QML.
KDAB specializes in Modern C++ development, with a focus on desktop applications, GUI, embedded software, and operating systems.
Our experts are industry-recognized contributors and trainers, leveraging C++'s power and relevance across these domains to deliver high-quality software solutions.
KDAB can guide you incorporating Rust into your project, from as overlapping element to your existing C++ codebase to a complete replacement of your legacy code.
Unique Expertise for Desktop and Embedded Platforms
Whether you are using Linux, Windows, MacOS, Android, iOS or real-time OS, KDAB helps you create performance optimized applications on your preferred platform.
If you are planning to create projects with Slint, a lightweight alternative to standard GUI frameworks especially on low-end hardware, you can rely on the expertise of KDAB being one of the earliest adopters and official service partner of Slint.
KDAB has deep expertise in embedded systems, which coupled with Flutter proficiency, allows us to provide comprehensive support throughout the software development lifecycle.
Our engineers are constantly contributing to the Flutter ecosystem, for example by developing flutter-pi, one of the most used embedders.
KDAB invests significant time in exploring new software technologies to maintain its position as software authority. Benefit from this research and incorporate it eventually into your own project.
Start here to browse infos on the KDAB website(s) and take advantage of useful developer resources like blogs, publications and videos about Qt, C++, Rust, 3D technologies like OpenGL and Vulkan, the KDAB developer tools and more.
The KDAB Youtube channel has become a go-to source for developers looking for high-quality tutorial and information material around software development with Qt/QML, C++, Rust and other technologies.
Click to navigate the all KDAB videos directly on this website.
In over 25 years KDAB has served hundreds of customers from various industries, many of them having become long-term customers who value our unique expertise and dedication.
Learn more about KDAB as a company, understand why we are considered a trusted partner by many and explore project examples in which we have proven to be the right supplier.
The KDAB Group is a globally recognized provider for software consulting, development and training, specializing in embedded devices and complex cross-platform desktop applications.
Read more about the history, the values, the team and the founder of the company.
When working with KDAB you can expect quality software and the desired business outcomes thanks to decades of experience gathered in hundreds of projects of different sizes in various industries.
Have a look at selected examples where KDAB has helped customers to succeed with their projects.
KDAB is committed to developing high-quality and high-performance software, and helping other developers deliver to the same high standards.
We create software with pride to improve your engineering and your business, making your products more resilient and maintainable with better performance.
KDAB has been the first certified Qt consulting and software development company in the world, and continues to deliver quality processes that meet or exceed the highest expectations.
In KDAB we value practical software development experience and skills higher than academic degrees. We strive to ensure equal treatment of all our employees regardless of age, ethnicity, gender, sexual orientation, nationality.
Interested? Read more about working at KDAB and how to apply for a job in software engineering or business administration.
Full stack tracing is a tool that should be part of every software engineer’s toolkit. It's the best way to investigate and solve certain classes of hard problems in optimization and debugging. Because of the power and capability it gives the developer, we’ll be writing a series of blogs about it: when to use it, how to get it set up, how to create traces, and how to interpret results. Our goal is to get you capable enough to use full stack tracing to solve your tough problems too.
Firstly, what is it? Full stack tracing is tracing on the full software stack, from the operating system to the application. By collecting profiling information (timing, process, caller, API, and other info) from the kernel, drivers, software frameworks, application, and JavaScript environments, you’re able to see exactly how the individual components of a system are interacting. That opens up areas of investigation that are impossible to achieve with standard application profilers, kernel debug messages, or even strategically inserted printf() commands. One way to think of full stack tracing is like a developer’s MRI machine that allows you to look into a running system without disturbing it to determine what is happening inside. (And unlike other low-level traces that we've written about before, full stack tracing provides a simpler way to view activity up and down the entire software stack.)
Use cases
When do you need full stack tracing? Since it gives you visibility across a running system, it’s a great solution anytime you need to do things like:
Gain visibility into the software running below the level of your application and API calls
Measure timing and synchronization of events that happen across multiple applications
Optimize application startup time, locating startup script inefficiencies, and trimming down system boot time
Verify communication between a client and server process, or any other process pairs like publisher and subscriber, or application and watchdog
Detect background processes that interfere with an application
Pinpoint places in the system that consume excessive CPU cycles
Find problematic interactions between multiple device drivers or independently running software stacks
Correctly understand the behavior of a complex system before you make bug fixes or alterations
Battling slow boot time
To put all of this in context, let’s walk through an example where we used full stack tracing to solve a show-stopping problem for a client. In this case, it was an automotive customer creating an infotainment system with a rather complex software stack. The main application was created in Qt and ran on a Nvidia Tegra X1, which is an extremely powerful processor for an embedded system. The GPU rivals that of laptops with a quad core CPU. Because the software required integration from many partners, developers unfortunately realized very late during integration that despite all of the power, the infotainment system was taking over five seconds to show the application’s first screen. This was completely unacceptable, and we were asked to diagnose why it was taking so long.
As much of the application was in QML, we first took a look at the code. While it certainly wasn’t crafted for speed – and they probably used far too much JavaScript code – this didn’t seem like the primary problem. We needed to determine if the boot issues were due to an overall sluggishness with too many pieces of software running, or if we could pinpoint the problems to one (or several) specific areas. Firing up the QML profiler highlighted something we didn’t expect: a trivial binding evaluation of a string to an image file in early startup took 180 milliseconds. What was happening underneath it that caused that binding to take so long?
QML profiler showing long binding in action (similar issue, different app)
That’s a question that the QML profiler can’t answer. While you can make many educated guesses to figure out exactly what’s happening, it is this type of problem where full stack tracing is invaluable. It can tell you what Qt, C++, kernel, driver, or hardware interactions are happening underneath the QML so you can see where the time is being spent.
By turning on full stack tracing for the kernel, syscalls, and C++, adding tracepoints within Qt, and rerunning the application, we were able to find that those long bindings were occurring because many libraries were being loaded for image plug-in formats. Since the application only used PNG files, we were able to eliminate a huge number of these unnecessary loads for other file plug-ins. That included the SVG plugin, which was especially heavy. Although the application was a Qt Quick app that didn’t use Qt Widgets, the SVG plugin was pulling in the relatively big Qt Widgets library, which was unneeded. (This is because the plugin was compiled with QtWidgets support enabled.) But while eliminating the plug-ins made a dramatic improvement, it still wasn’t fast enough, and it was taking far longer to boot than it should.
Again, full stack tracing was able to show us that the disk I/O was a huge culprit and many applications were hammering the disk during startup. However, most of those applications accessing the disk were background tasks that weren’t critical to the startup procedure. Furthermore, most of the disk I/O was coming through a single call within syslog: the processes were emitting reams of debug output, going through syslog to write the data out to the flash drive. The CPU and GPU had more than enough horsepower to manage this, but the on-board flash was really cheap…and pretty slow.
While it might have been better to remove unnecessary logging from those background tasks, that change would have required a lot of coordination and testing because many distinct software suppliers were involved. That would have introduced a good deal of risk to a project that was already perilously close to the production deadline. Instead, we settled for bumping up the main application’s priority. The increased priority let the GUI beat out those background tasks during boot up. The background tasks were able to harmlessly finish their work slightly later. This change was now able to bring the splash screen up within one second, meeting the customer requirements to everyone’s great relief.
While full stack tracing was critical to the technical understanding of the problem, it was also necessary for the project management. With the high pressure of the impending deadline, engineers had to convincingly explain that the corrective actions being taken had a high degree of probable success. Full stack tracing indicated precisely what the problems were and exactly where in the software stack they were, avoiding any finger-pointing between suppliers.
QML binding shown with full stack tracing, highlighting SDA activity (similar issue, different app)
Until next time
Hopefully you’ll agree that full stack tracing could be a valuable skill to add to your repertoire. Stay tuned – our next blog will show you how to get it set up.
If you'd like to learn more about Debugging and Profiling, you might be interested in one of our training classes. Visit this page to find out some of the areas and correlating tools with which we are used to working, and with which we can help you.
About KDAB
Trusted software excellence across embedded and desktop platforms
The KDAB Group is a globally recognized provider for software consulting, development and training, specializing in embedded devices and complex cross-platform desktop applications. In addition to being leading experts in Qt, C++ and 3D technologies for over two decades, KDAB provides deep expertise across the stack, including Linux, Rust and modern UI frameworks. With 100+ employees from 20 countries and offices in Sweden, Germany, USA, France and UK, we serve clients around the world.
Milian, thank you for the great post and the many past presentations and blog posts!
Related to this post: Rico Mariani has written Holistic approaches to performance analysis [1] where he describes how to zoom out and investigate performance issues by looking at the whole system. There's a lot be gained from that kind of approach, and I'm glad to see the tooling improve for Qt/Linux, in no small part thanks to KDAB.
Yes, ETW is a godsend and I would really like to have something similar on Linux. The power of that technology is also nicely exemplified by the blogs from Bruce Dawson e.g.
And while I agree that the Linux landscape is improving, it is still a long road ahead, until we have something as powerful as UIforETW and WPA.
Thanks for your comment!
21 - Dec - 2020
Daniel Näslund
Just read Mark Russinovich statement that eBPF-based Sysmon for Linux is coming as preview in February.
Exiting times!
Thank you for your work on tracing and looking forward to more articles!
14 - Mar - 2021
Alois Kraus
It might be of interest to you that you can load LTTNG Traces into WPA with the current WPA Preview which is part of the Windows ADK when you compile the Linux plugins for WPA with the WPA SDK. See https://github.com/microsoft/Microsoft-Performance-Tools-Linux. The plan is to bring the power of WPA to any tracing solution to open WPA for any trace file format including your own ones. This are exciting times. Great work on the LTTNG stuff. Compared to ETW Linux still seems to have a long way to go to capture the relevant events. Normally I need
a) Context Switch Traces with Switch out call stack and ready call stack
b) Profiling Sampling data with 1 kHz with Call Stacks
c) Disk IO duration per file or directory (no stacks)
d) Sampled Memory consumption from all processes (e.g. every 0.5s)
e) Process start/end duration and command line
I am still reading into LTTNG but as far as I understand it can deliver a-e with the exception of b) for which perf is still needed on top. Is this correct?
Do you have a LTTNG Tracer call with this data providers enabled at hand? It would be interesting to see where the gaps are compared to ETW.
14 - Mar - 2021
Milian Wolff
Hey, that's really great news! Any chance to port WPA to Linux, such that we can run it there too? Otherwise it's going to be relatively useless to me I'm afraid ;-) I won't be rebooting into Windows just to look at some performance data.
Regarding your question:
You are right in your assessment - sampling requires the perf subsystem, but there's an experimental LTTNG data source which integrates that. Though personally I'd rather like to see a)-e) implemented in terms of eBPF or perf directly. That should be doable, but we will need a new kind of kernel tracepoint or similar to allow memory consumption sampling.
WPA runs now on .NET Core 3.1. No it won't be ported to Linux natively but you should get it up and running in Wine to some extent. If something is missing you can give feedback and hope that some UI glitches will be solved. I have the book of Brendan Greg (BPF Performance Tools) but to me it just looks like another way to skin the cat.
What is the technical advantage of eBPF over LTTNG and is it really separate? LTTNG could also integrate eBPF to get everything into one place?
So LTTNG gives me Context Switch Traces and related with Stack Traces (lttng add-context --kernel --type callstack-kernel \
--type callstack-user) and I get in Trace Compass Full Stack traces like in WPA? I have tried that but it did not work out. Are there specific tricks necessary to get scheduler events with stack traces?
15 - Mar - 2021
Milian Wolff
Ah cool, I have to try and see if one can get WPA to work under wine. Thanks for the suggestion!
Regarding eBPF vs. LTTNG: The big difference would be that eBPF is upstream in the kernel, whereas LTTNG is a separate module and thus comes with an additional setup burden. Furthermore, at least in principle, eBPF has full access to the existing perf subsystem and already can do stack sampling (albeit only with framepointers, which is not good enough for more user space application setups). That said, you are completely right in saying that LTTNG could be rebased on top of eBPF to some degree.
Regarding callstacks for context switches - no, with LTTNG you won't get any call stacks associated to an event by default. I thought one could add that nowadays somehow too, but fail to find it right now - maybe it's somewhere hidden in the lttng mailing list archives.
Milian Wolff has a long history of creating tools for C++ developers. He’s the main author of Massif-Visualizer, heaptrack, hotspot and ctf2ctf tools now used widely to improve C++ applications performance. He’s a Senior Software Engineer at KDAB where he enjoys solving hard performance problems and teaching developers about debugging and profiling tools. Milian has a Masters Degree in Physics.
Our hands-on Modern C++ training courses are designed to quickly familiarize newcomers with the language. They also update professional C++ developers on the latest changes in the language and standard library introduced in recent C++ editions.
7 Comments
28 - Sept - 2020
Daniel Näslund
Milian, thank you for the great post and the many past presentations and blog posts!
Related to this post: Rico Mariani has written Holistic approaches to performance analysis [1] where he describes how to zoom out and investigate performance issues by looking at the whole system. There's a lot be gained from that kind of approach, and I'm glad to see the tooling improve for Qt/Linux, in no small part thanks to KDAB.
[1] https://docs.microsoft.com/en-us/archive/blogs/ricom/holistic-approaches-to-performance-analysis
28 - Sept - 2020
Milian Wolff
Yes, ETW is a godsend and I would really like to have something similar on Linux. The power of that technology is also nicely exemplified by the blogs from Bruce Dawson e.g.
And while I agree that the Linux landscape is improving, it is still a long road ahead, until we have something as powerful as UIforETW and WPA.
Thanks for your comment!
21 - Dec - 2020
Daniel Näslund
Just read Mark Russinovich statement that eBPF-based Sysmon for Linux is coming as preview in February. Exiting times!
https://twitter.com/markrussinovich/status/1340737856201879552
Thank you for your work on tracing and looking forward to more articles!
14 - Mar - 2021
Alois Kraus
It might be of interest to you that you can load LTTNG Traces into WPA with the current WPA Preview which is part of the Windows ADK when you compile the Linux plugins for WPA with the WPA SDK. See https://github.com/microsoft/Microsoft-Performance-Tools-Linux. The plan is to bring the power of WPA to any tracing solution to open WPA for any trace file format including your own ones. This are exciting times. Great work on the LTTNG stuff. Compared to ETW Linux still seems to have a long way to go to capture the relevant events. Normally I need a) Context Switch Traces with Switch out call stack and ready call stack b) Profiling Sampling data with 1 kHz with Call Stacks c) Disk IO duration per file or directory (no stacks) d) Sampled Memory consumption from all processes (e.g. every 0.5s) e) Process start/end duration and command line
I am still reading into LTTNG but as far as I understand it can deliver a-e with the exception of b) for which perf is still needed on top. Is this correct? Do you have a LTTNG Tracer call with this data providers enabled at hand? It would be interesting to see where the gaps are compared to ETW.
14 - Mar - 2021
Milian Wolff
Hey, that's really great news! Any chance to port WPA to Linux, such that we can run it there too? Otherwise it's going to be relatively useless to me I'm afraid ;-) I won't be rebooting into Windows just to look at some performance data.
Regarding your question: You are right in your assessment - sampling requires the perf subsystem, but there's an experimental LTTNG data source which integrates that. Though personally I'd rather like to see a)-e) implemented in terms of eBPF or perf directly. That should be doable, but we will need a new kind of kernel tracepoint or similar to allow memory consumption sampling.
Regarding the list of useful lttng traces, have a look at part 2 of this series, i.e.: https://www.kdab.com/full-stack-tracing-part-2/
15 - Mar - 2021
Alois Kraus
WPA runs now on .NET Core 3.1. No it won't be ported to Linux natively but you should get it up and running in Wine to some extent. If something is missing you can give feedback and hope that some UI glitches will be solved. I have the book of Brendan Greg (BPF Performance Tools) but to me it just looks like another way to skin the cat.
What is the technical advantage of eBPF over LTTNG and is it really separate? LTTNG could also integrate eBPF to get everything into one place? So LTTNG gives me Context Switch Traces and related with Stack Traces (lttng add-context --kernel --type callstack-kernel \ --type callstack-user) and I get in Trace Compass Full Stack traces like in WPA? I have tried that but it did not work out. Are there specific tricks necessary to get scheduler events with stack traces?
15 - Mar - 2021
Milian Wolff
Ah cool, I have to try and see if one can get WPA to work under wine. Thanks for the suggestion!
Regarding eBPF vs. LTTNG: The big difference would be that eBPF is upstream in the kernel, whereas LTTNG is a separate module and thus comes with an additional setup burden. Furthermore, at least in principle, eBPF has full access to the existing perf subsystem and already can do stack sampling (albeit only with framepointers, which is not good enough for more user space application setups). That said, you are completely right in saying that LTTNG could be rebased on top of eBPF to some degree.
Regarding callstacks for context switches - no, with LTTNG you won't get any call stacks associated to an event by default. I thought one could add that nowadays somehow too, but fail to find it right now - maybe it's somewhere hidden in the lttng mailing list archives.
If you "just" want to investigate off-CPU time, I can recommend hotspot's off-cpu profiling capabilities based on perf instead: https://github.com/KDAB/hotspot#off-cpu-profiling