Efficiency Matters! Streamlining your Modern UX with Compressed Textures

Holoplot UX – Example of modern user interface that could benefit from compressed textures. (KDAB designed UX, photo courtesy of Holoplot.)

It’s every programmer’s worst nightmare. Your beautiful app is running at a snail’s pace, crippled by virtual memory swapping. Even worse, you’ve added one last bitmap resource, and suddenly unrelated chunks of the UX aren’t showing up!

Desktop machines have become powerful enough that programmers rarely worry about performance issues, and today’s embedded systems usually have enough horsepower at their disposal. But in constructing sophisticated user interfaces with more and more elaborate imagery and high-definition backgrounds, we are often sapping one resource that’s always scarce: RAM. That’s especially true for the dedicated video RAM (VRAM) that sits on the GPU.

With ever-increasing screen densities, it’s easy to see why VRAM is in such demand. An iPad with a retina display is 2048×1536 pixels, and at 4 bytes per pixel, that’s a whopping 12MB of RAM for just a static background image. Toss in a few more large bitmaps, masks, or textures, and you can be consuming dozens or hundreds of megabytes of VRAM, not to mention regular RAM.
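To make the arithmetic concrete, here is a minimal sketch of the calculation. The helper name `uncompressedBytes` is an assumption made purely for illustration; it just multiplies width, height, and 4 bytes per RGBA pixel.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Qt): bytes needed to hold an image
// once it has been expanded to 32-bit RGBA, i.e. 4 bytes per pixel.
constexpr std::uint64_t uncompressedBytes(std::uint64_t width, std::uint64_t height)
{
    return width * height * 4;
}

// A 2048x1536 background expands to 12,582,912 bytes, which is exactly 12 MiB.
static_assert(uncompressedBytes(2048, 1536) == 12582912ULL, "2048x1536 RGBA size");
static_assert(uncompressedBytes(2048, 1536) == 12ULL * 1024 * 1024, "equals 12 MiB");
```

That is 12 MiB, or about 12.6 MB in decimal units, for a single full-screen image.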

“Not me! All my images are compressed JPEGs or PNGs,” you say. Yes, but Qt Quick 2 (QQ2) uncompresses images on loading them, so your svelte, skinny PNGs uncompress into giant blocks of RAM on program start. And those bitmaps take an equivalent bite out of VRAM when they’re displayed. Making one-bit deep masks or using a 16- or 8-bit color depth doesn’t really help, as everything gets expanded out into 32-bit RGBA on loading.

What’s an enterprising Qt Quick 2 programmer to do? Take a tip from our game engine friends, that’s what. Modern video games load and manage hundreds of large, high-quality textures, and they get away with it by using custom compressed textures. The textures are directly loaded off disk and left in a compressed format that the GPU can understand. These formats aren’t JPEG or PNG—they’re highly specialized formats that need a time-consuming conversion process. But the GPU can directly read those formats and uncompress them to the display on-the-fly, meaning that your RAM (and VRAM) only takes the hit for the compressed sizes, not the fully expanded size. That can result in dramatic savings of both RAM and VRAM.
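As a rough illustration of the difference, the sketch below compares the two footprints. It assumes the ETC2 RGB layout used by the loader in this article, where every 4×4 pixel block is stored in 8 bytes; the function names are hypothetical, and other compressed formats use different block sizes.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; neither is a Qt or OpenGL API.

// Footprint after expansion to 32-bit RGBA (4 bytes per pixel).
constexpr std::uint64_t rgbaBytes(std::uint64_t width, std::uint64_t height)
{
    return width * height * 4;
}

// Footprint as ETC2 RGB: one 8-byte block per 4x4 pixels.
// Assumes width and height are already multiples of 4.
constexpr std::uint64_t etc2RgbBytes(std::uint64_t width, std::uint64_t height)
{
    return (width / 4) * (height / 4) * 8;
}

// For a 2048x1536 image: 12 MiB expanded versus 1.5 MiB compressed.
static_assert(rgbaBytes(2048, 1536) == 12582912ULL, "expanded RGBA size");
static_assert(etc2RgbBytes(2048, 1536) == 1572864ULL, "ETC2 RGB size");
static_assert(rgbaBytes(2048, 1536) / etc2RgbBytes(2048, 1536) == 8, "8:1 ratio");
```

Under these assumptions the compressed texture occupies one eighth of the memory of its expanded RGBA counterpart, regardless of image content.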

We don’t want to reinvent all of Qt’s display machinery, so unless we can convince Qt to use compressed textures, any possible savings would be academically nice but practically out of reach. Fortunately, through the QOpenGLTexture class (a KDAB contribution, btw), QQ2 provides all the necessary APIs to let us change the underlying behavior to use compressed textures without mucking around in the internals.

Overriding the QSGTexture class

    class CompressedSGTexture : public QSGTexture
    {
        Q_OBJECT
    
    public:
        CompressedSGTexture(const PKMImage &image);
        ~CompressedSGTexture();
        void bind() Q_DECL_OVERRIDE;
        bool hasAlphaChannel() const Q_DECL_OVERRIDE;
        bool hasMipmaps() const Q_DECL_OVERRIDE;
    
        int textureId() const Q_DECL_OVERRIDE;
        QSize textureSize() const Q_DECL_OVERRIDE;
    
    private:
        PKMImage m_image;
        QScopedPointer<QOpenGLTexture> m_texture;
    };

The first step is to compress the images using a GPU-friendly compression scheme. Unfortunately, there are a number of these formats, many are proprietary and poorly documented, and they aren’t readily exportable from common image-editing tools like GIMP or Photoshop. Thankfully, ARM includes, as part of its ARM Mali visual technology suite, a great, freely available tool that handles a number of the most commonly encountered compression formats: the Mali GPU Texture Compression Tool. Be warned that compression is slow: using the tool to compress some sample images at the highest quality setting took over 30 minutes on a decent machine, so you may want to settle for slightly less than perfection!

Loading GPU-compressed PKM files in the image provider

QQuickTextureFactory *CompressedTextureImageProvider::requestTexture(const QString &id, QSize *size, const QSize &requestedSize)
{
    Q_UNUSED(requestedSize);

    QFile imageFile(QStringLiteral(":/%1.pkm").arg(id));
    if (!imageFile.open(QFile::ReadOnly))
        return 0;

    QByteArray header = imageFile.read(PKM_HEADER_LENGTH);
    if (header.length() != PKM_HEADER_LENGTH) {
        qWarning() << "PKM header too short";
        return 0;
    }

    PKMImage image;
    const char *headerData = header.constData();

    // Parse the PKM header
    if (memcmp(headerData, PKM_HEADER_PREAMBLE, PKM_HEADER_PREAMBLE_LENGTH) != 0) {
        qWarning() << "Malformed PKM header (missing heading)";
        return 0;
    }
    headerData += 4;

    if (memcmp(headerData, PKM_HEADER_VERSION, PKM_HEADER_VERSION_LENGTH) != 0) {
        qWarning() << "Malformed PKM header (wrong version)";
        return 0;
    }
    headerData += 2;

#define UCC(x) (reinterpret_cast<const uchar *>(x))
    const quint16 dataType = qFromBigEndian<quint16>(UCC(headerData));
    if (dataType != PKM_ETC2_RGB_NO_MIPMAPS) {
        qWarning() << "Malformed PKM header (wrong data type)";
        return 0;
    }
    headerData += 2;

    image.effectiveSize.rwidth() = qFromBigEndian<quint16>(UCC(headerData));
    headerData += 2;
    image.effectiveSize.rheight() = qFromBigEndian<quint16>(UCC(headerData));
    headerData += 2;
    image.originalSize.rwidth() = qFromBigEndian<quint16>(UCC(headerData));
    headerData += 2;
    image.originalSize.rheight() = qFromBigEndian<quint16>(UCC(headerData));
    headerData += 2;
#undef UCC

    Q_ASSERT(image.effectiveSize.width() % 4 == 0);
    Q_ASSERT(image.effectiveSize.height() % 4 == 0);
    Q_ASSERT(headerData == header.constEnd());

    // Read out the payload
    const qint64 imageDataLength = ((image.effectiveSize.width() / 4) * (image.effectiveSize.height() / 4)) * 8;
    image.data = imageFile.read(imageDataLength);
    if (image.data.length() != imageDataLength) {
        qWarning() << "Malformed PKM file: payload too short";
        return 0;
    }

    if (!imageFile.atEnd()) {
        qWarning() << "Malformed PKM file: data after the payload";
        return 0;
    }

    if (size)
        *size = image.effectiveSize;

    return new CompressedTextureFactory(image);
}
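The header carries two sizes because block-compressed data always covers whole 4×4 blocks: the original size is the size of the source image, and the effective size is that size padded up to the next multiple of 4, which is why the loader asserts divisibility by 4. A minimal sketch of that relationship follows, using a hypothetical `paddedToBlock` helper and a plain struct standing in for `QSize` so that it needs no Qt headers:

```cpp
// Plain stand-in for QSize so the sketch compiles without Qt.
struct Size {
    int width;
    int height;
};

// Hypothetical helper: round a dimension up to the next multiple of 4,
// the edge length of one compressed block.
constexpr int paddedToBlock(int n)
{
    return (n + 3) / 4 * 4;
}

// Effective (padded) size that would accompany a given original size.
constexpr Size effectiveSizeFor(Size original)
{
    return Size{paddedToBlock(original.width), paddedToBlock(original.height)};
}

// 1000 is already a multiple of 4; 750 is padded up to 752.
static_assert(paddedToBlock(1000) == 1000, "already block-aligned");
static_assert(paddedToBlock(750) == 752, "padded to next multiple of 4");
```

For a 1000×750 source image, only the height changes: the effective size becomes 1000×752.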

Once you have your images compressed in a GPU-digestible way, the QQ2-related code is pretty straightforward, with QOpenGLTexture doing most of the heavy lifting. Here’s sample code (including all the assorted code snippets) with the tweaks you’ll need. Note that this is a proof of concept rather than a complete, runnable sample, so #include <all standard disclaimers> …

To pull off the compressed texture magic, we need a custom image provider (CompressedTextureImageProvider), a custom texture factory (CompressedTextureFactory, derived from QQuickTextureFactory), and a custom QSGTexture subclass (CompressedSGTexture in the sample code). Although the compressed texture file is nearly the same size as the original PNG, the RAM/VRAM savings for a background image at runtime are about 12MB!

Compressed Texture Factory

class CompressedTextureFactory : public QQuickTextureFactory
{

    Q_OBJECT
public:
    CompressedTextureFactory(const PKMImage &image);

    QSGTexture *createTexture(QQuickWindow *window) const Q_DECL_OVERRIDE;
    QImage image() const Q_DECL_OVERRIDE;
    int textureByteCount() const Q_DECL_OVERRIDE;
    QSize textureSize() const Q_DECL_OVERRIDE;

private:
    PKMImage m_image;
};

CompressedTextureFactory::CompressedTextureFactory(const PKMImage &image)
    : m_image(image)
{
}

QSGTexture *CompressedTextureFactory::createTexture(QQuickWindow *window) const
{
    Q_UNUSED(window);
    return new CompressedSGTexture(m_image);
}

QImage CompressedTextureFactory::image() const
{
    // FIXME/TODO: we can't easily get a QImage out of compressed texture data;
    // uncompressing image left as an exercise for the reader. This function
    // isn't called under normal circumstances...
    return QImage();
}

int CompressedTextureFactory::textureByteCount() const
{
    return m_image.data.length();
}

QSize CompressedTextureFactory::textureSize() const
{
    return m_image.effectiveSize;
}

The other big advantage of compressed textures is CPU utilization, especially during program initialization. Everyone appreciates faster program start times, and that’s especially true on constrained devices like tablets, mobiles, or embedded devices. Not only are we skipping the decompression step during the load phase, but we’re also minimizing the size of copies between RAM and VRAM.

Tons less memory and faster execution for a sprinkling of calls and an extra step added to the build process. Not a bad payoff!

Categories: KDAB Blogs / KDAB on Qt / OpenGL / Qt3D

2 thoughts on “Efficiency Matters!”

  1. Nice post!
    But how can I deal with images dynamically loaded in the app? For example an image viewer application which loads images from a network share (displaying a large amount of images in a grid view). I think compressing them is not an option as it takes too much time (?)

    1. Andy Gryc

      Do you have the option to pre-compress the images in place? That will double the storage requirement but if it’s on a network share, that’s probably not an issue. Then if the GPU-compressed version is available your app can use it, and if not you’d default to the standard image. Do the pre-compress as a separate background thread, just like creating thumbnails. Not sure if that works for your app (and may not be worth the hassle), but it’s the only thing that comes to mind.
