Optimizing and Sharing Shader Structures
When writing large graphics applications in Vulkan or OpenGL, there are many data structures that need to be passed from the CPU to the GPU and vice versa. There are also subtle differences in alignment, padding and so on between C++ and GLSL to keep track of. I’m going to cover a tool I wrote that generates safe, tightly packed structure code for both sides. This helps not only the GPU but the programmer writing shaders too. Here’s a rundown of the problems I’m trying to solve and how you can implement a similar system in your own programs.
This tool specifically targets and references Vulkan rules, but similar rules exist in OpenGL.
Reasoning
Here’s an example of real code, exposing options to a post-processing stage.
layout(push_constant) uniform PushConstant {
vec4 viewport;
vec4 options;
vec4 transform_ops;
vec4 ao_options;
vec4 ao_options2;
vec4 proj_info;
mat4 cameraProj;
mat4 invProj;
};
Even for the person who wrote this code, it’s hard to tell what each option does at a glance. This is a great way to create bugs, since it’s extremely easy to mix up accessors like ao_options.x and ao_options.y. Ideally, we want these options to be separated, but there’s a reason why they’re packed in the first place.
Alignment rules
Say you’re beginning to explore Phong shading, and you want to expose a light’s position and color so you can change them while the program is running. In a 3D environment, there are three axes (X, Y and Z), so naturally the position is a vec3. It also makes sense for the light color to be a vec3. When emitted from a light, its color can’t really be “transparent”, so we don’t need the alpha channel. The GLSL code so far looks like this:
#version 430
out vec4 finalColor;
layout(binding = 0) buffer block {
vec3 position;
vec3 color;
} light;
void main() {
const vec3 dummy = vec3(1) - light.position;
finalColor = vec4(vec3(1.0, 1.0, 1.0) * light.color, 1.0);
}
(There’s no Phong formula here; we just want to make sure the GLSL compiler doesn’t optimize anything out.)
When writing the structure on the C++ side, you might write something like this:
struct Light {
glm::vec3 position;
glm::vec3 color;
} light;
light.position = {1, 5, 0};
light.color = {3, 2, -1};
For this example I used the debug printf system, which is part of the Vulkan SDK, so we can confirm the exact values the shader reads. The output is as follows:
Position = (1.000000, 5.000000, 0.000000)
Color = (2.000000, -1.000000, 0.000000)
As you can see, the first value of color is getting chopped off when reading it in the shader. The usual solution to the problem is to use a vec4 instead:
struct Light {
glm::vec4 position;
glm::vec4 color;
};
And to confirm, this does indeed fix the issue:
Position = (1.000000, 5.000000, 0.000000)
Color = (3.000000, 2.000000, -1.000000)
But why does it work when we change it to a vec4? This section from the Vulkan specification spells it out for us:
- The base alignment of the type of an OpTypeStruct member is defined recursively as follows:
- A scalar has a base alignment equal to its scalar alignment.
- A two-component vector has a base alignment equal to twice its scalar alignment.
- A three- or four-component vector has a base alignment equal to four times its scalar alignment.
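The third bullet point hits the nail on the head: vec3 and vec4 have the same alignment! In GLSL, color therefore starts at byte 16, while the tightly packed C++ struct places it at byte 12. An alternative solution could be to use alignas: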
struct Light {
glm::vec3 position;
alignas(16) glm::vec3 color;
};
There are plenty more nitty-gritty alignment issues that stem from differences between C++ and GLSL, and this is just one of them. In my opinion, the programmer shouldn’t have to handle these themselves.
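One way to catch this kind of mismatch early is to assert the C++ offsets against the offsets the shader expects. The following is only a minimal sketch and not part of the tool: it uses a plain Vec3 stand-in for glm::vec3 so that it compiles without GLM, and the struct names PackedLight and AlignedLight are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for glm::vec3 (three tightly packed floats); an assumption so the
// sketch compiles without GLM.
struct Vec3 {
    float x, y, z;
};

// Naive layout: color starts right after position, at byte 12.
struct PackedLight {
    Vec3 position;
    Vec3 color;
};

// alignas(16) moves color to byte 16, where the shader expects a vec3.
struct AlignedLight {
    Vec3 position;
    alignas(16) Vec3 color;
};

// These checks fail at compile time if the layout ever drifts.
static_assert(offsetof(PackedLight, color) == 12, "tightly packed on the C++ side");
static_assert(offsetof(AlignedLight, color) == 16, "matches the vec3 base alignment");
static_assert(sizeof(AlignedLight) == 32, "padded to a multiple of 16 bytes");
```

Checks like these cost nothing at runtime, but they still have to be written and maintained by hand for every structure.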
Passing booleans
Another example of esoteric shader rules is when you try passing booleans. Take a look at this C++ structure, which seems okay at first glance:
struct TestBuffer {
bool a = false;
bool b = true;
bool c = false;
bool d = true;
};
And this is how it’s defined in GLSL:
layout(binding = 0) buffer readonly TestBuffer {
bool a, b, c, d;
};
When sent to the shader, the values of the structure end up like this:
a = 1, b = 0, c = 0, d = 0
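This is because SPIR-V doesn’t seem to define a physical size for bool, so it could be represented as anything (like an unsigned integer). Here the four one-byte C++ bools appear to land in the first 32-bit slot, which would explain why only a reads as non-zero. In this case, you actually want to define them as integers (with a matching 32-bit integer type on the C++ side):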
layout(binding = 0) buffer readonly TestBuffer {
int a, b, c, d;
};
This is a little disappointing, because the semantic meaning of a boolean option is lost when you declare it as an integer. You could also pack many booleans into a single 32-bit integer, which is a possible space-saving optimization for the future.
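To make the packing idea concrete, here’s a rough sketch of what the C++ side could look like. This is not what the tool currently does; the TestFlag and PackedTestBuffer names are hypothetical, and it assumes the GLSL block declares a single uint member to receive the flags.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag positions, one bit per boolean option.
enum TestFlag : std::uint32_t {
    FlagA = 1u << 0,
    FlagB = 1u << 1,
    FlagC = 1u << 2,
    FlagD = 1u << 3,
};

// Mirrors a GLSL block that holds a single `uint flags;` member (assumed).
struct PackedTestBuffer {
    std::uint32_t flags = 0;

    // Set or clear one flag without touching the others.
    void set(TestFlag flag, bool value) {
        if (value) {
            flags |= flag;
        } else {
            flags &= ~static_cast<std::uint32_t>(flag);
        }
    }

    // Read one flag back as a real bool.
    bool get(TestFlag flag) const { return (flags & flag) != 0; }
};
```

On the shader side, each flag would then be tested with a mask, for example (flags & 2u) != 0u for b.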
Sharing structures
The last problem is keeping the structures in sync. There’s usually one instance of the structure written in C++ and many copies in GLSL shaders. This is problematic because if the member order changes in one place but not the others, parts of the structure end up holding the wrong data, and the mistake can easily escape notice. Having one definition for all shaders and C++ would be a huge improvement!
Struct compiler
What I ended up with is a new pre-processing step, which I called the “struct compiler”. I tried searching the Internet to see if someone had already made a tool like this, but couldn’t find much; maybe shader reflection is more popular. I did learn a lot from making this tool anyway. Its main goals are:
- Define the shader structures in one, centralized file.
- Structures should be written at a higher level, allowing us to decouple the actual member order, alignment and packing from the logic. This lets the compiler optimize the structure in the future, maybe beyond what we can reasonably write by hand.
- The structure is usable in GLSL and C++.
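To illustrate the second goal: once member order is no longer part of the interface, a tool is free to reorder members to reduce padding. The sketch below shows one simple way to do that (sort by descending alignment, then assign offsets). It is an illustration under my own assumptions, not the algorithm the struct compiler uses, and the Member type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one member; size, alignment and offset in bytes.
struct Member {
    std::string name;
    std::size_t size;
    std::size_t alignment;
    std::size_t offset;
};

// Round `value` up to the next multiple of `alignment`.
std::size_t align_up(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Sort members by descending alignment (stable, so ties keep their declared
// order), assign offsets, and return the total size in bytes.
std::size_t layout_members(std::vector<Member>& members) {
    std::stable_sort(members.begin(), members.end(),
                     [](const Member& a, const Member& b) {
                         return a.alignment > b.alignment;
                     });
    std::size_t cursor = 0;
    for (Member& m : members) {
        cursor = align_up(cursor, m.alignment);
        m.offset = cursor;
        cursor += m.size;
    }
    return cursor;
}
```

Placing the most strictly aligned members first means the smaller ones can fill what would otherwise be padding.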
First you write a .struct file, describing the required members and their types. Here’s the same post-processing structure showcased in the beginning, but now written in the compiler’s custom syntax:
primary PostPushConstant {
viewport: vec4
camera_proj: mat4
inv_proj: mat4
inv_view: mat4
enable_aa: bool
enable_dof: bool
exposure: float
display_color_space: int
tonemapping: int
ao_radius: float
ao_r2: float
ao_rneginvr2: float
ao_rdotvbias: float
ao_intensity: float
ao_bias: float
}
This looks much better, doesn’t it? Even without knowing anything else about the actual shader, you can guess which options do what with some accuracy. Here’s what it might look like, compiled to C++:
struct PostPushConstant {
glm::mat4 camera_proj;
glm::mat4 inv_proj;
glm::mat4 inv_view;
glm::vec4 viewport;
glm::ivec4 enable_aa_enable_dof_display_color_space_tonemapping_;
glm::vec4 exposure_ao_radius_ao_r2_ao_rneginvr2_;
glm::vec4 ao_rdotvbias_ao_intensity_ao_bias_;
...
};
(Setters like set_exposure() are used instead of accessing the glm::vec4 manually.)
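For a rough idea of how such accessors can hide the packing, here’s a small sketch covering one of the packed slots. It is not the generated header itself: Vec4 stands in for glm::vec4 so the sketch compiles without GLM, the struct name is made up, and the C++ getter names are my assumption (only set_exposure() is mentioned above).

```cpp
#include <cassert>

// Stand-in for glm::vec4 so the sketch compiles without GLM.
struct Vec4 {
    float x = 0.0f, y = 0.0f, z = 0.0f, w = 0.0f;
};

// Hypothetical excerpt of generated code: four scalar options share one
// 16-byte slot, and callers only ever use the named accessors.
struct PostPushConstantSketch {
    Vec4 exposure_ao_radius_ao_r2_ao_rneginvr2_;

    void set_exposure(float value) { exposure_ao_radius_ao_r2_ao_rneginvr2_.x = value; }
    void set_ao_radius(float value) { exposure_ao_radius_ao_r2_ao_rneginvr2_.y = value; }

    float exposure() const { return exposure_ao_radius_ao_r2_ao_rneginvr2_.x; }
    float ao_radius() const { return exposure_ao_radius_ao_r2_ao_rneginvr2_.y; }
};

static_assert(sizeof(PostPushConstantSketch) == 16, "one vec4-sized slot");
```

If the compiler later moves exposure to a different slot or component, only the accessor bodies change and calling code stays the same.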
I hook the generation step into my build system so it runs automatically, and all you need to do is include the auto-generated header. To use the structure in GLSL, I created a new directive that inserts the GLSL version of the structure given by the struct compiler. The same system that generates the C++ headers also generates the GLSL, which is inserted wherever this directive is found:
#use_struct(push_constant, post, post_push_constant)
(The syntax could use some work, but the first argument is the usage, and the second argument is the name of the struct. The third argument is a unique name for the instance.)
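If you want to try something similar, the directive expansion can be a simple line-based text substitution. The sketch below is my own minimal take and not the tool’s implementation: the function names and the StructEmitter callback are hypothetical, and it assumes the directive sits on its own line with exactly three comma-separated arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical callback: given (usage, struct name, instance name), returns
// the GLSL text that should replace the directive.
using StructEmitter = std::function<std::string(
    const std::string&, const std::string&, const std::string&)>;

// Remove leading and trailing spaces and tabs.
std::string trim(const std::string& text) {
    const std::size_t first = text.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const std::size_t last = text.find_last_not_of(" \t");
    return text.substr(first, last - first + 1);
}

// Replace every `#use_struct(a, b, c)` line with the emitter's output and
// copy all other lines through unchanged.
std::string expand_use_struct(const std::string& source, const StructEmitter& emit) {
    const std::string prefix = "#use_struct(";
    std::istringstream input(source);
    std::ostringstream output;
    std::string line;
    while (std::getline(input, line)) {
        const std::string stripped = trim(line);
        const bool is_directive =
            stripped.compare(0, prefix.size(), prefix) == 0 && stripped.back() == ')';
        std::vector<std::string> args;
        if (is_directive) {
            // Split the text between the parentheses on commas.
            std::istringstream arg_stream(
                stripped.substr(prefix.size(), stripped.size() - prefix.size() - 1));
            std::string arg;
            while (std::getline(arg_stream, arg, ',')) args.push_back(trim(arg));
        }
        if (args.size() == 3) {
            output << emit(args[0], args[1], args[2]) << '\n';
        } else {
            output << line << '\n';  // not a well-formed directive: keep as-is
        }
    }
    return output.str();
}
```

The emitter is where the generated GLSL block and its getter functions would come from; here it is left as a callback so the substitution logic stays separate from code generation.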
Since the member order and names are undefined, you must access the members by a setter/getter in GLSL and C++. I think this is a worthwhile trade-off for readable code.
vec3 ao_result = pow(ao, vec3(ao_intensity())); // assumes ao is a vec3; pow() needs both operands to have the same type
This tool runs as a pre-processing step offline, before shader compilation begins. The tool’s source code, taken from one of my personal projects, is available here. It was written quickly and I don’t recommend using it directly, but I’m confident that this idea is worth pursuing.