Emote animations: fix DBC chain for /laugh, /flirt, /sleep, /fart, /stink.
Previously all emotes with AnimID=0 used emoteRef as animId (wrong DBC
record IDs). Now resolves through Emotes.dbc properly, with per-emote
overrides for emotes whose DBC chain yields 0. Adds Emotes.dbc load
failure warning and diagnostic logging.
WMO culling: skip portal culling when camera is outside all groups (fixes
vanishing Stormwind ground tiles). Also handle indoor/outdoor AABB overlap
by showing all groups when position is in both indoor and outdoor AABBs.
Transport: clear ONTRANSPORT flag and transport state when transport not
found, preventing stale transport data from teleporting player to map
origin. Add area trigger safety net near (0,0,0) on Eastern Kingdoms.
M2 GPU instancing
- M2InstanceGPU SSBO (96 B/entry, double-buffered, 16384 max)
- Group opaque instances by (modelId, LOD); single vkCmdDrawIndexed per group
- boneBase field indexes into mega bone SSBO via gl_InstanceIndex
Indirect terrain drawing
- 24 MB mega index buffer (6M uint32) + 64 MB mega vertex buffer
- CPU builds VkDrawIndexedIndirectCommand per visible chunk
- Single VB/IB bind per frame; shadow pass reuses mega buffers
- Replaced vkCmdDrawIndexedIndirect with direct vkCmdDrawIndexed to fix
host-mapped buffer race condition that caused terrain flickering
GPU frustum culling (compute shader)
- m2_cull.comp.glsl: 64-thread workgroups, sphere-vs-6-planes + distance cull
- CullInstanceGPU SSBO input, uint visibility[] output, double-buffered
- dispatchCullCompute() runs before main pass via render graph node
Consolidated bone matrix SSBOs
- 16 MB double-buffered mega bone SSBO (2048 instances × 128 bones)
- Eliminated per-instance descriptor sets; one megaBoneSet_ per frame
- prepareRender() packs bone matrices consecutively into current frame slot
Render graph / frame graph
- RenderGraph: RGResource handles, RGPass nodes, Kahn topological sort
- Automatic VkImageMemoryBarrier/VkBufferMemoryBarrier between passes
- Passes: minimap_composite, worldmap_composite, preview_composite,
shadow_pass, reflection_pass, compute_cull
- beginFrame() uses buildFrameGraph() + renderGraph_->execute(cmd)
Pipeline derivatives
- PipelineBuilder::setFlags/setBasePipeline for VK_PIPELINE_CREATE_DERIVATIVE_BIT
- M2 opaque = base; alphaTest/alpha/additive are derivatives
- Applied to terrain (wireframe) and WMO (alpha-test) renderers
Rendering bug fixes:
- fix(shadow): compute lightSpaceMatrix before updatePerFrameUBO to eliminate
one-frame lag that caused shadow trails and flicker on moving objects
- fix(shadow): scale depth bias with shadowDistance_ instead of hardcoded 0.8f
to prevent acne at close range and gaps at far range
- fix(visibility): WMO group distance threshold 500u → 1200u to match terrain
view distance; buildings were disappearing on the horizon
- fix(precision): camera near plane 0.05 → 0.5 (ratio 600K:1 → 60K:1),
eliminating Z-fighting and improving frustum plane extraction stability
- fix(streaming): terrain load radius 4 → 6 tiles (~2133u → ~3200u) to exceed
M2 render distance (2800u) and eliminate pop-in when camera turns;
unload radius 7 → 9; spawn radius 3 → 4
- fix(visibility): ground-detail M2 distance multiplier 0.75 → 0.9 to reduce
early pop of grass and debris
deferAfterFrameFence only waits for one frame slot's fence, but shared
resources (material descriptor sets, vertex/index buffers) are bound by
both in-flight frames' command buffers. On AMD RADV this caused
vkFreeDescriptorSets errors and eventual SIGSEGV.
Add deferAfterAllFrameFences: queues to every frame slot with a shared
counter so cleanup runs exactly once, after the last slot is fenced.
Use it for WMO, terrain, water, and character model shared resources.
Per-frame bone sets keep using deferAfterFrameFence (already correct).
Also fix character renderer vertex format: R8G8B8A8_UINT -> _SINT to
match shader's ivec4 input (RADV validation rejects the mismatch).
M2 destroyInstanceBones and WMO destroyGroupGPU freed descriptor sets
and buffers immediately during tile streaming, while in-flight command
buffers still referenced them — causing DEVICE_LOST on AMD RADV.
Now defers GPU resource destruction via deferAfterFrameFence in streaming
paths (removeInstance, removeInstances, unloadModel). Immediate
destruction preserved for shutdown/clear paths that vkDeviceWaitIdle
first.
Also: vkDeviceWaitIdle before WMO backfillNormalMaps descriptor rebinds,
and fillModeNonSolid added to required device features for wireframe
pipelines on AMD.
M2 particle/ribbon/batch, terrain layer, and WMO material texture
resolution paths were silently falling back to white textures when
indices were out of range — making missing texture issues hard to
diagnose. Add LOG_WARNING at each silent failure point with model
name, index details, and array sizes.
- character_renderer: extract duplicated fallback texture creation
(white/transparent/flat-normal) into createFallbackTextures() — was
copy-pasted between initialize() and clear()
- wmo_renderer: replace magic 8192 with kMaxRetryTracked constant,
add why-comment explaining the fallback-retry set cap (Dalaran has
2000+ unique WMO groups)
- quest_handler: add why-comment on reqCount=0 fallback — escort/event
quests can report kill credit without objective counts in query response
Final sweep across mpq_manager, application, auth_screen, wmo_renderer,
character_renderer, and terrain_manager. All now use the unsigned char
cast pattern. No remaining bare ::tolower/::toupper or std::tolower(c)
calls on signed char in the codebase.
- Replace count()+operator[] double lookups with find() or try_emplace()
in gameObjectInstances_, playerTextureSlotsByModelId_, onlinePlayerAppearance_
- Add Entity::isUnit() helper; replace 5 dynamic_cast<Unit*> in per-frame
UI rendering (nameplates, combat text, pet frame) with isUnit()+static_cast
- Add constexpr kInv255 reciprocal for per-pixel normal map generation loops
in character_renderer and wmo_renderer
Spline parsing: remove Classic format fallback from the WotLK parser. The
PacketParsers hierarchy already dispatches to expansion-specific parsers
(Classic/TBC/WotLK/Turtle), so the WotLK parseMovementBlock should only
attempt WotLK spline format. The Classic fallback could false-positive when
durationMod bytes resembled a valid point count, corrupting downstream parsing.
Preload DBC caches: call loadSpellNameCache() and 5 other lazy DBC caches
during handleLoginVerifyWorld() on initial world entry. This moves the ~170ms
Spell.csv load from the first SMSG_SPELL_GO handler to the loading screen,
eliminating the mid-gameplay stall.
WMO portal culling: move per-instance portalVisibleGroups vector and
portalVisibleGroupSet to reusable member variables, eliminating heap
allocations per WMO instance per frame.
Squared distance optimizations across 30 files:
- Convert glm::length() comparisons to glm::dot() (no sqrt)
- Use glm::inversesqrt() for check-then-normalize patterns (1 rsqrt vs 2 sqrt)
- Defer sqrt to after early-out checks in collision/movement code
- Hottest paths: camera_controller (21), weather particles, WMO collision,
transport movement, creature interpolation, nameplate culling
Container and algorithm improvements:
- std::map<string> → std::unordered_map for asset/DBC/MPQ/warden caches
- std::mutex → std::shared_mutex for asset_manager and mpq_manager caches
- std::sort → std::partial_sort in lighting_manager (top-2 of N volumes)
- Double-lookup find()+operator[] → insert_or_assign in game_handler
- Add reserve() for per-frame vectors: weather, swim_effects, WMO/M2 collision
Threading and synchronization:
- Replace 1ms busy-wait polling with condition_variable in character_renderer
- Move timestamp capture before mutex in logger
- Use memory_order_acquire/release for normal map completion signaling
API additions:
- DBC getStringView()/getStringViewByOffset() for zero-copy string access
- Parse creature display IDs from SMSG_CREATURE_QUERY_SINGLE_RESPONSE
Replace ~37 remaining C-style casts with static_cast across 16 files.
Extract named color constants (kColorRed/Green/Yellow/Gray) and dialog
window flags (kDialogFlags) in game_screen.cpp, replacing 72 inline
literals. Normalize keybinding_manager.hpp to #pragma once.
Create a VkPipelineCache at device init, loaded from disk if available.
All 65 pipeline creation calls across 19 renderer files now use the
shared cache. On shutdown, the cache is serialized to disk so subsequent
launches skip redundant shader compilation.
Cache path: ~/.local/share/wowee/pipeline_cache.bin (Linux),
~/Library/Caches/wowee/ (macOS), %APPDATA%\wowee\ (Windows).
Stale/corrupt caches are handled gracefully (fallback to empty cache).
The wall/floor classification threshold was lowered from 0.65 to 0.35
in a prior optimization commit, causing surfaces at 35-65° from
horizontal (steep walls, angled building geometry) to be classified as
floors and skipped during wall collision. This allowed the player to
clip through angled WMO walls.
Restore the threshold to 0.65 (cos 50°) in both the collision grid
builder and the runtime checkWallCollision skip, matching the
MAX_WALK_SLOPE limit used for slope-slide physics.
gameObjectDisplayIdWmoCache_ was not cleared on world unload/transition,
causing stale WMO model IDs (e.g. 40006, 40003) to be looked up after
the renderer cleared its model list, resulting in "Cannot create instance
of unloaded WMO model" errors on zone re-entry.
Changes:
- Clear gameObjectDisplayIdWmoCache_ alongside other GO caches on world reset
- Add WMORenderer::isModelLoaded() for cache-hit validation
- Inline GO WMO path now verifies cached model is still renderer-resident
before using it; evicts stale entries and falls back to reload
wmo_renderer: when portal BFS starts from a group with no portal refs
(utility/transition group), the rest of the WMO becomes invisible because
BFS only adds the starting group. Fix: if cameraGroup has portalCount==0,
fall back to marking all groups visible (same as camera-outside behavior).
renderer: minimap player-orientation arrow was pointing the wrong direction.
The shader convention is arrowRotation=0→North, positive→clockwise (West),
negative→East. The correct mapping from canonical yaw is arrowRotation =
-canonical_yaw. Fixed both render paths (cameraController and gameHandler)
in both the FXAA and non-FXAA minimap render calls.
World-map W-key: already corrected in prior commit (13c096f); users with
stale ~/.wowee/settings.cfg should rebind via Settings > Controls.
cleanupUnusedModels() runs every 5 seconds and freed vertex/index buffers
without waiting for the GPU to finish the previous frame's command buffer.
This caused VK_ERROR_DEVICE_LOST (-4) after extended gameplay when tiles
stream out and their models are freed mid-render.
Add vkDeviceWaitIdle() before the buffer destroy loop in both M2Renderer
and WMORenderer cleanupUnusedModels(). The wait only happens when there are
models to remove, so quiet sessions have no overhead.
Replace the basic forward-vector culling (which only culls when all AABB
corners are behind the camera) with proper frustum-AABB intersection testing
for more accurate and aggressive visibility culling. This reduces overdraw
and improves rendering performance in WMO-heavy scenes (dungeons, buildings).
Move ShadowParamsUBO from 5 separate shadow rendering functions (2 in
m2_renderer, 1 in terrain_renderer, 1 in wmo_renderer) into shared
vk_frame_data.hpp header. Eliminates 5 identical local struct definitions
and improves consistency across all shadow pass implementations. Structure
layout matches shader std140 uniform buffer requirements.
Move envSizeMBOrDefault and envSizeOrDefault from 4 separate rendering
modules (character_renderer, m2_renderer, terrain_renderer, wmo_renderer)
into shared vk_utils.hpp header as inline functions. Use the most robust
version which includes overflow checking for MB-to-bytes conversion. This
eliminates 7 identical local function definitions and improves consistency
across all rendering modules.
Move ShadowPush from 4 separate rendering modules (character_renderer,
m2_renderer, terrain_renderer, wmo_renderer) into shared vk_frame_data.hpp
header. This eliminates 4 identical local struct definitions and ensures
consistency across all shadow rendering passes. Add vk_frame_data.hpp include
to character_renderer.cpp.
- wmo_renderer: pass character position (not camera position) to portal
visibility traversal — the 3rd-person camera can orbit outside a WMO
while the character is inside, causing interior groups to cull; render()
now accepts optional viewerPos that defaults to camPos for compatibility
- renderer: pass &characterPosition to wmoRenderer->render() at both
main and single-threaded call sites; reflection pass keeps camPos
- renderer: apply mount pitch/roll to rider during all flight, not just
taxiFlight_ (fixes zero rider tilt during player-controlled flying)
- game_screen: format SAY/YELL/WHISPER/EMOTE using WoW-style "Name says:"
instead of "[SAY] Name:" bracket prefix
isPortalVisible() was computing the world-space AABB by directly
transforming pMin/pMax with the model matrix. This is incorrect for
rotated WMOs — when the model matrix includes rotations, components can
be swapped or negated, yielding an inverted AABB (worldMin.x >
worldMax.x) that causes frustum.intersectsAABB() to fail.
Fix: transform all 8 corners of the portal bounding box and take the
component-wise min/max, which gives the correct world-space AABB for any
rotation/scale. This was the root cause of portals being incorrectly
culled in rotated WMO instances (e.g. many dungeon and city WMOs).
Also squash the earlier spline-speed no-op fix (parse guid + float
instead of consuming the full packet for SMSG_SPLINE_SET_FLIGHT_SPEED
and friends) into this commit.
Read the ambient color from the MOHD chunk (BGRA uint32) and store it
on WMOModel as a normalized RGB vec3. Pass it through ModelData into
the per-batch WMOMaterialUBO (replacing the unused pad[3] bytes, keeping
the struct at 64 bytes). The GLSL interior branch now floors vertex
colors against the WMO ambient instead of a hardcoded 0.5, so dungeon
interiors respect the artist-specified ambient tint from the WMO root
rather than always clamping to grey.
- Overlap M2 and character animation updates via std::async (~2-5ms saved)
- Thread-local collision scratch buffers for concurrent floor queries
- Parallel terrain/WMO/M2 floor queries in camera controller
- Seed new M2 instance bones from existing siblings to eliminate pop-in flash
- Fix shadow flicker: snap center along stable light axes instead of in view space
- Increase shadow distance default to 300 units (slider max 500)
Move CPU-heavy BLP texture decoding from main thread to background worker
threads for all hot paths: terrain M2 models, WMO doodad M2s, WMO textures,
creature models, and gameobject WMOs. Each renderer (M2, WMO, Character) now
accepts a pre-decoded BLP cache that loadTexture() checks before falling back
to synchronous decode.
Defer WMO normal/height map generation (3 per-pixel passes: luminance, box
blur, Sobel) during terrain streaming finalization — this was the dominant
remaining bottleneck after BLP pre-decoding.
Terrain streaming stalls: 1576ms → 124ms worst case.
Every uploadBuffer/VkTexture::upload called immediateSubmit which did a
separate vkQueueSubmit + vkWaitForFences. Loading a single creature model
with textures caused 4-8+ fence waits; terrain chunks caused 80+ per batch.
Added beginUploadBatch/endUploadBatch to VkContext: records all upload
commands into a single command buffer, submits once with one fence wait.
Staging buffers are deferred for cleanup after the batch completes.
Wrapped in batch mode:
- CharacterRenderer::loadModel (creature VB/IB + textures)
- M2Renderer::loadModel (doodad VB/IB + textures)
- TerrainRenderer::loadTerrain/loadTerrainIncremental (chunk geometry + textures)
- TerrainRenderer::uploadPreloadedTextures
- WMORenderer::loadModel (group geometry + textures)
- Add magma/slime rendering path to water shader (fbm noise, crust/molten/core coloring)
- Fix WMO liquid height filter rejecting high-altitude zones like Ironforge (Z>300)
- Allow interior WMO magma/slime MLIQ groups to load (skip only water/ocean)
- Mark LAVASTEAM.m2 as spell effect for proper additive blend, hide emission mesh
- Add isLavaModel flag for M2 ForgeLava/LavaPots UV scroll fallback
- Add isLava material detection in WMO renderer for lava texture UV animation
- Fix WMO material UBO colors for magma (was blue, now orange-red)
- Add case-insensitive "glass" detection for WMO window materials
- Make instance (WMO-only) glass highly transparent (12-35% alpha)
so underwater scenes are visible through Deeprun Tram windows
- Keep normal world windows at existing opacity (40-95% alpha)
- Disable shadow mapping for interior WMO groups to fix dark
indoor areas like Ironforge
- Stronger wall collision push (0.35/0.15) and swept push (0.45/0.25)
for interior/exterior WMOs to reduce clipping through tunnel walls
- Use all triangles (not just pre-classified walls) for collision checks
- Allow invisible collidable triangles (MOPY 0x01 without 0x20) to block
- Pass insideWMO flag to all collision callers, match swim sweep to ground
- Widen swept hit detection radius from 0.15 to 0.25
- Restrict camera zoom to 12 units inside WMO interiors
- Fix /unstuck launching player above WMOs: remove +20 fallback, use
gravity when no floor found
- Slash and Enter keys always focus chat unless already typing
WMO origins can be far from their visible geometry, causing large city
buildings to be culled from the shadow pass. Use world bounding box for
instance culling and per-group AABB culling. Also increase WMO shadow
cull radius to match the shadow map coverage (180 units).
WMO interior doodads (gears, decorations) were blocking player movement
via M2 collision. Skip collision for all WMO doodad M2 instances since
the WMO itself handles wall collision.
Also filter WMO wall collision using MOPY per-triangle flags: only
rendered+collidable triangles block the player, skipping invisible
collision hulls.
Revert tram portal extended range (no longer needed with collision fix).
Parse MOPY per-triangle flags in WMO groups and exclude detail/decorative
triangles (flag 0x04) from collision detection. This prevents invisible
walls from objects like gears and railings in WMO interiors.
Add WotLK area trigger IDs 2173/2175 to extended-range tram triggers.
- Fix Stormwind barracks floor: interior WMO groups named "facade" were
incorrectly marked as LOD shells and hidden when close. Add !isIndoor
guard to all LOD detection conditions so interior groups always render.
- Fix water exit stair clipping: anchor lastGroundZ to current position
on swim exit, set grounded=true for full step-up budget, add upward
velocity boost to clear stair lip geometry.
- Re-enable NPC humanoid equipment geosets (kEnableNpcHumanoidOverrides)
so guards render with proper armor instead of underwear.
- Keep instance portal GameObjects animated (spinning/glowing) instead
of freezing all GO animations indiscriminately.
- Fix equipment disappearing after instance round-trip by resetting
dirty tracking on world reload.
- Fix multi-doodad-set loading: load both set 0 (global) and placement-
specific doodad set, with dedup to avoid double-loading.
- Clear placedWmoIds in softReset/unloadAll to prevent stale dedup.
- Apply MODF rotation to instance WMOs, snap player to WMO floor.
- Re-enable rebuildSpatialIndex in setInstanceTransform.
- Store precomputeFloorCache results in precomputed grid.
- Add F8 debug key for WMO floor diagnostics at player position.
- Expand mapIdToName with all Classic/TBC/WotLK instance map IDs.
Low-vertex groups (<100 verts) were incorrectly marked as distance-only
LOD shells in small WMOs like Stockades. Now only applies this heuristic
to large WMOs (50+ groups) where it's needed for city exterior shells.
Add null checks for vertex/index buffers, pipelines, and zero-count
draws in WMO render path. The shadow pass already had buffer validation
but the main render() was missing it, which could cause GPU crashes
on WMO-only maps like Stockades (26 groups).
Reset descriptor pools in CharacterRenderer/M2Renderer/WMORenderer on map
change to prevent VK_ERROR_DEVICE_LOST from pool exhaustion. Defer re-entrant
SMSG_NEW_WORLD during active world load to avoid recursive cleanup crashes.
Gate swim bubbles on swimming state, skip redundant shadow pipeline re-init,
add WOWEE_SKIP_* env vars for render isolation debugging.
- Fix WDT chunk magic constants to big-endian ASCII (matching ADTLoader)
- Add minimum effective size for box area triggers (90 units, like sphere 45-unit radius)
- Add areaTriggerSuppressFirst_ flag to prevent portal ping-pong on map transfer
- Add WMORenderer::clearAll() to clear models/textures on map change (prevents GPU crash)
- Increase WMO texture cache default to 8GB
- Fix setMapName called after loadTestTerrain so WMO renderer exists
- Save/restore player position around CMSG_AREATRIGGER to prevent bad DB persistence
StormLib's bundled libtomcrypt uses x86 inline assembly (bswapl/movl)
gated by __MINGW32__, which is defined on CLANGARM64 too. Pass
-DLTC_NO_BSWAP to force portable C byte-swap fallback.
Replace monolithic finalizeTile() with a phased state machine that spreads
GPU upload work across multiple frames (TERRAIN→M2→WMO→WATER→AMBIENT→DONE).
Each advanceFinalization() call does one bounded unit of work within the
per-frame time budget, eliminating 50-300ms frame hitches when entering cities.
Additional performance improvements:
- Pre-allocate bone SSBOs at M2 instance creation instead of lazily during
first render frame, preventing hitches when many skinned characters appear
- Enable WMO distance culling (800 units) with active-group exemption so
the player's current floor/neighbors are never culled
- Add 4-tier adaptive M2 render distance (250/400/600/1000 based on count)
- Remove dead PendingM2Upload queue code superseded by incremental system
Fix tile re-enqueueing bug: keep tiles in pendingTiles until committed to
loadedTiles (not when moved to finalizingTiles_) so streamTiles() doesn't
re-enqueue tiles mid-finalization. Also handle unloadTile() for tiles in
the finalizingTiles_ deque to prevent orphaned water/M2/WMO resources.
- Cache material UBO mapped pointers at creation time, eliminating
per-batch vmaGetAllocationInfo() calls in the hot render path
- Precompute foliage/elven/lantern/kobold model name classifications
at load time instead of per-instance string operations every frame
- Remove redundant descriptor set and push constant rebinds on WMO
pipeline switches (preserved across compatible layouts)
- Pre-allocate glow sprite descriptor set once at init instead of
allocating from the pool every frame