Mozilla is the maker of the famous Firefox web browser and the birthplace of the likes of Rust and Servo (read more about Embedding the Servo Web Engine in Qt).
Firefox is a huge, multi-platform, multi-language project with 21 million lines of code back in 2020, according to their own blog post. Navigating a project of that size is always a challenge, especially at the cross-language boundaries and in platform-specific code.
To improve working with the Firefox code-base, Mozilla hosts an online code browser tailored for Firefox called Searchfox. Searchfox analyzes C++, JavaScript, various IDLs (interface definition languages), and Rust source code and makes them all browsable from a single interface with full-text search, semantic search, code navigation, test coverage report, and git blame support. It's the combination of a number of projects working together, both internal to Mozilla (like their Clang plugin for C++ analysis) and external (such as rust-analyzer maintained by Ferrous Systems).
It takes a whole repository in and separately indexes C++, Rust, JavaScript and now Java and Kotlin source code. All those analyses are then merged together across platforms, before running a cross-reference step and building the final index used by the web front-end available at searchfox.org.
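As a rough illustration of the cross-platform merge step, the sketch below folds per-platform record lists into a single map that remembers which platforms produced each record. The `Record` struct, its fields, and the `mergeAcrossPlatforms` name are assumptions made for this example only; Searchfox's actual analysis records are JSON and carry much more information.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, heavily simplified analysis record (not Searchfox's real schema).
struct Record {
    std::string file;
    int line;
    std::string symbol;

    bool operator<(const Record& other) const {
        return std::tie(file, line, symbol) <
               std::tie(other.file, other.line, other.symbol);
    }
};

// Platform name -> records produced when indexing for that platform.
using PlatformRecords = std::map<std::string, std::vector<Record>>;
// Record -> set of platforms on which it was seen.
using MergedRecords = std::map<Record, std::set<std::string>>;

// Deduplicates identical records while keeping track of their platforms.
MergedRecords mergeAcrossPlatforms(const PlatformRecords& perPlatform) {
    MergedRecords merged;
    for (const auto& entry : perPlatform) {
        for (const Record& record : entry.second) {
            merged[record].insert(entry.first);
        }
    }
    return merged;
}

// Usage example: one record is shared by both platforms, one is Linux-only.
void exampleMerge() {
    const Record shared{"Foo.cpp", 10, "foo"};
    const Record linuxOnly{"Foo.cpp", 20, "bar"};

    PlatformRecords input;
    input["android"].push_back(shared);
    input["linux"].push_back(shared);
    input["linux"].push_back(linuxOnly);

    MergedRecords merged = mergeAcrossPlatforms(input);
    assert(merged.size() == 2);
    assert(merged[shared].size() == 2);
    assert(merged[linuxOnly].size() == 1);
}
```

Keeping the platform set next to each record is one simple way to let a later step tell shared code apart from platform-specific code.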
Mozilla merged the Firefox for Android source code into the main mozilla-central repository that Searchfox indexes. To add support for that new Java and Kotlin code to Searchfox, we reused open-source tooling built by Sourcegraph around the SemanticDB and SCIP code indexing formats. (Many thanks to them!)
Sourcegraph's semanticdb-javac and semanticdb-kotlinc compiler plugins are integrated into Firefox's CI system to export SemanticDB artifacts. The Searchfox indexer fetches those SemanticDB files and turns them into a SCIP index, using scip-semanticdb. That SCIP index is then consumed by the existing Searchfox-internal scip-indexer tool.
In the process, a couple of upstream contributions were made to rust-analyzer (which also emits SCIP data) and scip-semanticdb.
GeckoView is an Android wrapper around Gecko, the Firefox web engine. It extensively uses cross-language calls between Java and C++.
Searchfox already had support for cross-language interfaces, thanks to its IDL support. We built on top of that to support direct cross-language calls between Java and C++.
First, we identified the different ways the C++ and Java code interact and call each other. There are three ways Java methods marked with the native keyword call into C++:
Case A1: By default, the JVM will search for a matching C function to call based on its name. For instance, calling org.mozilla.gecko.mozglue.GeckoLoader.nativeRun from Java will call Java_org_mozilla_gecko_mozglue_GeckoLoader_nativeRun on the C++ side.
Case A2: This behavior can be overridden at runtime by calling the JNIEnv::RegisterNatives function on the C++ side to point at another function.
Case A3: GeckoView has a code generator that looks for Java items decorated with the @WrapForJNI and native annotations and generates a C++ class template meant to be used through the Curiously Recurring Template Pattern. This template provides an Init static member function that does the right JNIEnv::RegisterNatives calls to bind the Java methods to the implementing C++ class's member functions.
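To make Case A3 more concrete, here is a minimal sketch of the CRTP arrangement it describes. It does not use JNI at all: a plain in-process map (`fakeNativeRegistry`) stands in for what `JNIEnv::RegisterNatives` would do, and `DoublerNatives`, `Doubler`, `callNative`, and the `sample.Doubler.nativeDouble` name are all hypothetical. The code GeckoView actually generates will look different.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Stand-in for the JVM's table of native bindings. In real code this role is
// played by JNIEnv::RegisterNatives; nothing here touches JNI.
std::map<std::string, std::function<int(int)>>& fakeNativeRegistry() {
    static std::map<std::string, std::function<int(int)>> registry;
    return registry;
}

// Hypothetical shape of a generated class template, meant to be used through
// CRTP: Impl derives from DoublerNatives<Impl> and provides the real code.
template <class Impl>
class DoublerNatives {
public:
    // Binds the Java-side method name to Impl's member function.
    static void Init() {
        fakeNativeRegistry()["sample.Doubler.nativeDouble"] = &Impl::NativeDouble;
    }
};

// Hand-written implementation class: inherits the generated template and
// supplies the function that the Java native method should end up calling.
class Doubler : public DoublerNatives<Doubler> {
public:
    static int NativeDouble(int value) { return value * 2; }
};

// Simulates the JVM dispatching a call to a native method by name.
int callNative(const std::string& javaMethod, int argument) {
    assert(fakeNativeRegistry().count(javaMethod) == 1 && "Init() must run first");
    return fakeNativeRegistry().at(javaMethod)(argument);
}

// Usage example: register the bindings once, then dispatch by Java name.
void exampleCrtpBinding() {
    Doubler::Init();
    assert(callNative("sample.Doubler.nativeDouble", 21) == 42);
}
```

The point of the pattern is that the generated template knows the Java names while the implementing class knows the behavior, and `Init` is the only place where the two meet.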
We also identified two ways the C++ code calls Java methods:
Case B1: directly with JNIEnv::Call… functions.
Case B2: GeckoView's code generator also looks for Java methods marked with @WrapForJNI (without the native keyword this time) and generates a C++ wrapper class and member functions with the right JNIEnv::Call… calls.
Only the C++ side has the complete view of the bindings; so that's where we decided to extract the information from, by extending Mozilla's existing Clang plugin.
First, we defined custom C++ annotations, bound_as and binding_to, that the Clang plugin transforms into the right format for the cross-reference analysis. This means we can manually set the binding information:
class __attribute__((annotate("binding_to", "jvm", "class", "S_jvm_sample/Jni#"))) CallingJavaFromCpp
{
    __attribute__((annotate("binding_to", "jvm", "method", "S_jvm_sample/Jni#javaStaticMethod().")))
    static void javaStaticMethod()
    {
        // Wrapper code
    }

    __attribute__((annotate("binding_to", "jvm", "method", "S_jvm_sample/Jni#javaMethod().")))
    void javaMethod()
    {
        // Wrapper code
    }

    __attribute__((annotate("binding_to", "jvm", "getter", "S_jvm_sample/Jni#javaField.")))
    int javaField()
    {
        // Wrapper code
        return 0;
    }

    __attribute__((annotate("binding_to", "jvm", "setter", "S_jvm_sample/Jni#javaField.")))
    void javaField(int)
    {
        // Wrapper code
    }

    __attribute__((annotate("binding_to", "jvm", "const", "S_jvm_sample/Jni#javaConst.")))
    static constexpr int javaConst = 5;
};

class __attribute__((annotate("bound_as", "jvm", "class", "S_jvm_sample/Jni#"))) CallingCppFromJava
{
    __attribute__((annotate("bound_as", "jvm", "method", "S_jvm_sample/Jni#nativeStaticMethod().")))
    static void nativeStaticMethod()
    {
        // Real code
    }

    __attribute__((annotate("bound_as", "jvm", "method", "S_jvm_sample/Jni#nativeMethod().")))
    void nativeMethod()
    {
        // Real code
    }
};
(This example is, in fact, extracted from our test suite, jni.cpp vs Jni.java.)
Then, we wrote heuristics that try to identify cases A1 (C functions named Java_…), A3 and B2 (C++ code generated from @WrapForJNI decorators) and automatically generate these annotations. Cases A2 and B1 (manually calling JNIEnv::RegisterNatives or JNIEnv::Call… functions) are rare in the Firefox code base and impossible to recognize reliably, so we decided not to cover them for now. Developers who wish to declare such bindings can annotate them manually.
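The case A1 heuristic can be pictured roughly as follows. This is only a sketch of the idea, not the Clang plugin's code: `guessJvmSymbol` is a hypothetical helper, the symbol layout is inferred from the sample annotations above (`S_jvm_sample/Jni#nativeMethod().`), and it assumes that the last two name components are the class and the method. Names that contain JNI escape sequences (`_1`, `_2`, `_3`, `_0xxxx`) or an overload suffix (`__…`) are rejected rather than decoded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns a guessed JVM symbol for a C function named
// Java_<package parts>_<Class>_<method>, or "" if the name does not fit.
std::string guessJvmSymbol(const std::string& cName) {
    const std::string prefix = "Java_";
    if (cName.compare(0, prefix.size(), prefix) != 0) {
        return "";
    }

    // Split everything after the prefix on '_'.
    std::vector<std::string> parts(1);
    for (std::size_t i = prefix.size(); i < cName.size(); ++i) {
        if (cName[i] == '_') {
            parts.emplace_back();
        } else {
            parts.back() += cName[i];
        }
    }
    if (parts.size() < 2) {
        return "";  // need at least a class and a method
    }
    for (const std::string& part : parts) {
        // An empty part means "__"; a leading digit means a JNI escape.
        // Neither is handled by this sketch.
        if (part.empty() || (part[0] >= '0' && part[0] <= '9')) {
            return "";
        }
    }

    std::string symbol = "S_jvm_";
    for (std::size_t i = 0; i + 2 < parts.size(); ++i) {
        symbol += parts[i] + "/";  // package components
    }
    symbol += parts[parts.size() - 2] + "#" + parts.back() + "().";
    return symbol;
}

// Usage example, including names the heuristic refuses to guess.
void exampleGuess() {
    assert(guessJvmSymbol("Java_sample_Jni_nativeMethod") ==
           "S_jvm_sample/Jni#nativeMethod().");
    assert(guessJvmSymbol("main").empty());
    assert(guessJvmSymbol("Java_sample_Jni_native_1method").empty());
}
```

Rejecting ambiguous names keeps the sketch from emitting wrong bindings; anything it cannot handle would fall back to a manual annotation.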
After this point, we used Searchfox's existing analysis JSON format and mostly re-used what was already available from IDL support. When triggering the context menu for a binding wrapper or bound function, the definitions in both languages are made available, with “Go to” actions that jump over the generally irrelevant binding internals.
The search results also display both sides of the bridge.
Aside from this Java/Kotlin-related work, we also added support for displaying and interacting with macro expansions. This was inspired by KDAB's own codebrowser.dev, and improves on it in two ways:
Display all expansion variants, if they differ across platforms or by definition:
[Figure: Per-platform expansions]
[Figure: Per-definition expansions]
Make macros fully indexed and interactive:
[Figure: In-macro context menu]
This work mainly happened in the Mozsearch Clang plugin to extract macro expansions during the pre-processing stage and index them with the rest of the top-level code.
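For readers unfamiliar with how one macro can have several expansions, the following self-contained sketch shows both kinds of variants mentioned above. The macro and function names (`PATH_SEPARATOR`, `SCALE`, and so on) are invented for the example and do not come from the Firefox code base.

```cpp
#include <cassert>

// Per-platform variant: the same macro expands differently depending on the
// target, so an indexer that merges platforms sees more than one expansion.
#if defined(_WIN32)
#define PATH_SEPARATOR '\\'
#else
#define PATH_SEPARATOR '/'
#endif

char pathSeparator() { return PATH_SEPARATOR; }

// Per-definition variant: the same macro name is defined twice in one file,
// so each use site expands according to the definition active at that point.
#define SCALE(x) ((x) * 2)
int scaledEarly(int value) { return SCALE(value); }
#undef SCALE

#define SCALE(x) ((x) * 10)
int scaledLate(int value) { return SCALE(value); }
#undef SCALE

// Usage example: identical-looking call sites, different expansions.
void exampleExpansions() {
    assert(scaledEarly(3) == 6);
    assert(scaledLate(3) == 30);
    assert(pathSeparator() == '/' || pathSeparator() == '\\');
}
```

Because the expanded code only exists after pre-processing, this kind of information has to be captured at that stage for it to be indexed.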
If you want more details, the feature request is available on Bugzilla, and the implementation and further technical discussion are on GitHub.
Summary
Because of the many technologies it uses, from compiler plugins and code analyzers written in many languages, to a web front-end written in the usual HTML/CSS/JS, by way of custom tooling and scripts in Rust, Python and Bash, Searchfox is a small but complex and really interesting project to work on. KDAB successfully added Java/Kotlin code indexing, including analysis of the C++ bindings, and is starting to improve Searchfox's C++ support itself, first with fully indexed macro expansions and next with improved template support.