SDL detective work

One of the most frustrating but also most rewarding things about software development is solving problems that you have no idea, at first, how to solve. When there's no clear path forward, you have to rely on intuition, detective work, and trial and error.

My SDL 2 branch of OpenTTD wasn't rendering correctly for a few people, and it looked like they were all hitting the same problem. I tried re-creating their environments: one report mentioned Arch Linux, so I booted up an Arch Linux VM in VirtualBox and tested my branch there. I knew it was unlikely that the distribution would make a difference, but I tried anyway, and still couldn't reproduce the problem.

I also found out that one thing these two reports had in common was that they were both using Nvidia graphics cards. I don't have access to any computer with an Nvidia card, which could explain why I couldn't reproduce the problem: everything I'm using has an Intel graphics device.

That was a good lead, so I searched online for similar problems specific to SDL 2 and Nvidia. I didn't find anything, likely because I'm using the old-style 2D surface API (SDL_UpdateWindowSurfaceRects()) to accommodate OpenTTD's custom blitters. SDL 2 applications typically update the screen with SDL_UpdateTexture() instead, using a whole different data structure (SDL_Texture) that can be hardware accelerated. So it's not surprising that I'm running into some kinks here; that's what happens when you do things in not-quite-expected ways.
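
For reference, here's roughly what that typical texture-based path looks like. This is a minimal sketch of the usual SDL 2 pattern, not OpenTTD code; the window size, title, and pixel format are made up for illustration:

#include "SDL.h"

int main(int argc, char *argv[])
{
    SDL_Init(SDL_INIT_VIDEO);

    SDL_Window *window = SDL_CreateWindow("texture path",
                                          SDL_WINDOWPOS_UNDEFINED,
                                          SDL_WINDOWPOS_UNDEFINED,
                                          640, 480, 0);
    SDL_Renderer *renderer = SDL_CreateRenderer(window, -1, 0);
    SDL_Texture *texture = SDL_CreateTexture(renderer,
                                             SDL_PIXELFORMAT_ARGB8888,
                                             SDL_TEXTUREACCESS_STREAMING,
                                             640, 480);

    static Uint32 pixels[640 * 480]; /* the application's own back buffer */

    /* Each frame: copy the pixel buffer into the texture and present it. */
    SDL_UpdateTexture(texture, NULL, pixels, 640 * sizeof(Uint32));
    SDL_RenderClear(renderer);
    SDL_RenderCopy(renderer, texture, NULL, NULL);
    SDL_RenderPresent(renderer);
    SDL_Delay(2000);

    SDL_Quit();
    return 0;
}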

So, back to the main problem: as far as I can see, my SDL 2 driver works, so why are these Nvidia cards having trouble? There must be something SDL itself is doing that prevents things from working: some key code path inside SDL that could illuminate things.

I found another clue just by searching for "NVIDIA" inside SDL's video code:

static SDL_bool
ShouldUseTextureFramebuffer()
{
    ...

        if (vendor &&
            (SDL_strstr(vendor, "ATI Technologies") ||
             SDL_strstr(vendor, "NVIDIA"))) {
            hasAcceleratedOpenGL = SDL_TRUE;
        }
    ...

    return hasAcceleratedOpenGL;
    ...
}

Thanks to explicit naming conventions and clear code, the problem was coming into focus: I don't want to be using the texture framebuffer at all! Textures are for hardware acceleration. I'm only dealing with old-school SDL_Surfaces, again, because of OpenTTD's custom blitters.
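
By contrast with the texture path sketched earlier, the surface path I'm relying on looks roughly like this. Again, this is a sketch with made-up names and a stand-in fill instead of a real blitter, not the actual OpenTTD driver code:

#include "SDL.h"

int main(int argc, char *argv[])
{
    SDL_Init(SDL_INIT_VIDEO);

    SDL_Window *window = SDL_CreateWindow("surface path",
                                          SDL_WINDOWPOS_UNDEFINED,
                                          SDL_WINDOWPOS_UNDEFINED,
                                          640, 480, 0);

    /* Ask SDL for the window's software framebuffer as a plain SDL_Surface. */
    SDL_Surface *surface = SDL_GetWindowSurface(window);

    /* OpenTTD's blitters write directly into surface->pixels; here we just
     * fill the surface with blue as a stand-in. */
    SDL_FillRect(surface, NULL, SDL_MapRGB(surface->format, 0, 0, 255));

    /* Then only the dirty rectangles are pushed to the screen. */
    SDL_Rect dirty = { 0, 0, 640, 480 };
    SDL_UpdateWindowSurfaceRects(window, &dirty, 1);

    SDL_Delay(2000);
    SDL_Quit();
    return 0;
}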

This ties in with another clue: how SDL_UpdateWindowSurfaceRects() is implemented:

int
SDL_UpdateWindowSurfaceRects(SDL_Window * window, const SDL_Rect * rects,
                             int numrects)
{
    CHECK_WINDOW_MAGIC(window, -1);

    if (!window->surface_valid) {
        return SDL_SetError("Window surface is invalid, please call SDL_GetWindowSurface() to get a new surface");
    }

    return _this->UpdateWindowFramebuffer(_this, window, rects, numrects);
}

And _this->UpdateWindowFramebuffer() is pointed at a texture-based updater depending on the value of ShouldUseTextureFramebuffer() from above:

/* Add the renderer framebuffer emulation if desired */
if (ShouldUseTextureFramebuffer()) {
    _this->CreateWindowFramebuffer = SDL_CreateWindowTexture;
    _this->UpdateWindowFramebuffer = SDL_UpdateWindowTexture;
    _this->DestroyWindowFramebuffer = SDL_DestroyWindowTexture;
}

Alright, so how do I avoid this code path and get ShouldUseTextureFramebuffer() to return SDL_FALSE when SDL starts up? There are several early-exit cases at the beginning of this function that I could take advantage of. Here's what I decided to use:

/* See if the user or application wants a specific behavior */
hint = SDL_GetHint(SDL_HINT_FRAMEBUFFER_ACCELERATION);
if (hint) {
    if (*hint == '0' || SDL_strcasecmp(hint, "false") == 0) {
        return SDL_FALSE;
    } else {
        return SDL_TRUE;
    }
}

So, by setting this hint in OpenTTD to turn off framebuffer acceleration, the Nvidia issue is solved:

SDL_SetHint(SDL_HINT_FRAMEBUFFER_ACCELERATION, "0");
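
One thing to watch: the ShouldUseTextureFramebuffer() check runs when SDL's video subsystem is initialized, so the hint has to be set before that happens. Here's a minimal sketch of the ordering; the surrounding code is illustrative, not the actual OpenTTD driver:

#include "SDL.h"

int main(int argc, char *argv[])
{
    /* The hint must be in place before the video subsystem comes up, because
     * that is when ShouldUseTextureFramebuffer() is consulted. */
    SDL_SetHint(SDL_HINT_FRAMEBUFFER_ACCELERATION, "0");

    if (SDL_InitSubSystem(SDL_INIT_VIDEO) < 0) {
        return 1;
    }

    /* ...create the window, grab its SDL_Surface, and blit as usual... */

    SDL_Quit();
    return 0;
}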