Background BLP texture pre-decoding + deferred WMO normal maps (12x streaming perf)

Move CPU-heavy BLP texture decoding from the main thread to background
worker threads for all hot paths: terrain M2 models, WMO doodad M2s, WMO
textures, creature models, and gameobject WMOs. Each renderer (M2, WMO,
Character) now accepts a pre-decoded BLP cache that loadTexture() checks
before falling back to a synchronous decode.
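
The cache-then-fallback lookup described above could look roughly like the sketch below. It is illustrative only, not the code in this commit: the `BLPImage` struct, `decodeBLPSync()`, and `fetchDecoded()` are hypothetical stand-ins, and moving the entry out of the cache on a hit is an assumption of the sketch.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for pipeline::BLPImage (the real struct is not shown here).
struct BLPImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> rgba;  // decoded RGBA8 pixels
};

// Hypothetical synchronous decoder, standing in for the slow main-thread path.
// It returns a 1x1 white image so the sketch stays self-contained.
inline BLPImage decodeBLPSync(const std::string& /*path*/) {
    BLPImage img;
    img.width = 1;
    img.height = 1;
    img.rgba = {255, 255, 255, 255};
    return img;
}

// Prefer an image that a worker thread already decoded; otherwise decode now.
// `cache` may be null, matching the nullptr default of predecodedBLPCache_.
inline BLPImage fetchDecoded(const std::string& path,
                             std::unordered_map<std::string, BLPImage>* cache,
                             bool& usedCache) {
    if (cache != nullptr) {
        auto it = cache->find(path);
        if (it != cache->end()) {
            usedCache = true;
            BLPImage img = std::move(it->second);  // assumption: each entry is consumed once
            cache->erase(it);
            return img;
        }
    }
    usedCache = false;
    return decodeBLPSync(path);  // fallback: synchronous decode on the calling thread
}
```

Passing the cache as a raw, nullable pointer mirrors the setter in the header diff: the caller owns the map and the renderer only borrows it for the duration of a load.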

Defer WMO normal/height map generation (3 per-pixel passes: luminance, box
blur, Sobel) during terrain streaming finalization. This was the dominant
remaining bottleneck after BLP pre-decoding.
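
To show why these passes are expensive, here is a minimal sketch of a luminance, box blur, Sobel pipeline over RGBA8 pixels. It is not the renderer's implementation: the function name `makeNormalHeightMap()`, the 3x3 kernel size, the wrap-around addressing, the sign convention, and the output layout (normal in RGB, height in A) are all assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert a value in [0, 1] to a byte, rounding and clamping.
static std::uint8_t toByte(float v) {
    long q = std::lround(v * 255.0f);
    return static_cast<std::uint8_t>(std::max(0L, std::min(255L, q)));
}

// Sketch: build an RGBA8 map with a tangent-space normal in RGB and height in A.
std::vector<std::uint8_t> makeNormalHeightMap(const std::vector<std::uint8_t>& rgba,
                                              int w, int h, float strength) {
    // Wrap coordinates so tiling textures stay seamless (an assumption of this sketch).
    const auto idx = [w, h](int x, int y) {
        x = (x % w + w) % w;
        y = (y % h + h) % h;
        return static_cast<std::size_t>(y) * w + x;
    };
    const std::size_t n = static_cast<std::size_t>(w) * h;

    // Pass 1: luminance (Rec. 601 weights), normalized to [0, 1].
    std::vector<float> lum(n);
    for (std::size_t i = 0; i < n; ++i) {
        lum[i] = (0.299f * rgba[4 * i] + 0.587f * rgba[4 * i + 1] +
                  0.114f * rgba[4 * i + 2]) / 255.0f;
    }

    // Pass 2: 3x3 box blur to suppress pixel noise before taking gradients.
    std::vector<float> height(n);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float sum = 0.0f;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    sum += lum[idx(x + dx, y + dy)];
            height[idx(x, y)] = sum / 9.0f;
        }
    }

    // Pass 3: Sobel gradients of the height field, turned into a unit normal.
    std::vector<std::uint8_t> out(n * 4);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            const float tl = height[idx(x - 1, y - 1)], t = height[idx(x, y - 1)];
            const float tr = height[idx(x + 1, y - 1)], l = height[idx(x - 1, y)];
            const float r = height[idx(x + 1, y)], bl = height[idx(x - 1, y + 1)];
            const float b = height[idx(x, y + 1)], br = height[idx(x + 1, y + 1)];
            const float gx = (tr + 2.0f * r + br) - (tl + 2.0f * l + bl);
            const float gy = (bl + 2.0f * b + br) - (tl + 2.0f * t + tr);
            const float nx = -gx * strength, ny = -gy * strength, nz = 1.0f;
            const float len = std::sqrt(nx * nx + ny * ny + nz * nz);
            const std::size_t o = idx(x, y) * 4;
            out[o + 0] = toByte(nx / len * 0.5f + 0.5f);
            out[o + 1] = toByte(ny / len * 0.5f + 0.5f);
            out[o + 2] = toByte(nz / len * 0.5f + 0.5f);
            out[o + 3] = toByte(height[idx(x, y)]);
        }
    }
    return out;
}
```

Each pass touches every pixel, and the blur and Sobel passes read nine neighbours per pixel, so the cost grows with texture area. That is consistent with deferring the work while tiles are being finalized.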

Terrain streaming stalls: 1576ms → 124ms worst case.
Kelsi 2026-03-07 15:46:56 -08:00
parent 0313bd8692
commit 7ac990cff4
13 changed files with 573 additions and 109 deletions

@@ -1,5 +1,6 @@
#pragma once
#include "pipeline/blp_loader.hpp"
#include <vulkan/vulkan.h>
#include <vk_mem_alloc.h>
#include <glm/glm.hpp>
@@ -325,6 +326,12 @@ public:
// Pre-compute floor cache for all loaded WMO instances
void precomputeFloorCache();
// Pre-decoded BLP cache: set before calling loadModel() to skip main-thread BLP decode
void setPredecodedBLPCache(std::unordered_map<std::string, pipeline::BLPImage>* cache) { predecodedBLPCache_ = cache; }
// Defer normal/height map generation during streaming to avoid CPU stalls
void setDeferNormalMaps(bool defer) { deferNormalMaps_ = defer; }
private:
// WMO material UBO — matches WMOMaterial in wmo.frag.glsl
struct WMOMaterialUBO {
@@ -558,6 +565,7 @@ private:
* Load a texture from path
*/
VkTexture* loadTexture(const std::string& path);
std::unordered_map<std::string, pipeline::BLPImage>* predecodedBLPCache_ = nullptr;
/**
* Generate normal+height map from diffuse RGBA8 pixels
@@ -670,6 +678,7 @@ private:
// Normal mapping / POM settings
bool normalMappingEnabled_ = true; // on by default
bool deferNormalMaps_ = false; // skip normal map gen during streaming
float normalMapStrength_ = 0.8f; // 0.0 = flat, 1.0 = full, 2.0 = exaggerated
bool pomEnabled_ = true; // on by default
int pomQuality_ = 1; // 0=Low(16), 1=Medium(32), 2=High(64)