Spell visual effects previously used a fixed 3.5s duration for all
effects, causing some to linger too long and overlap during combat.
Now queries the M2 model's default animation duration via the new
getInstanceAnimDuration() method and clamps it to 0.5-5s. Effects
without animations fall back to a 2s default. This makes spell impacts
feel more responsive and reduces visual clutter.
Models that lose all instances are no longer immediately evicted from
GPU memory. Instead they get a 60-second grace period, preventing the
thrash cycle where GO models (barrels, chests, herbs) were evicted
every 5 seconds and re-loaded when the same object type respawned.
Parse M2RibbonEmitter data (WotLK format) from M2 files — bone index,
position, color/alpha/height tracks, edgesPerSecond, edgeLifetime,
gravity. Add CPU-side trail simulation per instance (edge birth at bone
world position, lifetime expiry, gravity droop). New m2_ribbon.vert/frag
shaders render a triangle-strip quad per emitter using the existing
particleTexLayout_ descriptor set. Supports both alpha-blend and additive
pipeline variants based on material blend mode. Fixes invisible spell
trail effects (~5-10%% of spell visuals) that were silently skipped.
Pre-allocate one stable VkDescriptorSet per particle emitter at model
upload time (particleTexSets[]) instead of allocating a new set from
materialDescPool_ every frame for each particle group. The per-frame
path exhausted the 8192-set pool in ~14 s at 60 fps with 10 active
particle emitters, causing GPU device-lost crashes. The old path is
kept as an explicit fallback but should never be reached in practice.
Hang/GPU device lost fix:
- M2_INSTANCES and WMO_INSTANCES finalization phases now create instances
incrementally (32 per step / 4 per step) instead of all at once, eliminating
the >1s main-thread stalls that caused GPU fence timeouts and device loss
M2 two-pass transparent rendering:
- Opaque/alpha-test batches render in pass 1, transparent/additive in pass 2
(back-to-front sorted) to fix wing transparency showing terrain instead of
trees — adds hasTransparentBatches flag to skip models with no transparency
Tile streaming improvements:
- Sort new load queue entries nearest-first so critical tiles load before
distant ones during fast taxi flight
- Increase taxi load radius 6→8 tiles, unload 9→12 for better coverage
Water refraction gated on FSR:
- Disable water refraction when FSR is not active (bugged without upscaling)
- Auto-disable refraction if FSR is turned off while refraction was on
The boneDescPool_ had MAX_BONE_SETS=2048 but sets were never freed when
instances were removed (only when clear() reset the whole pool on map load).
As tiles streamed in/out, each new animated instance consumed 2 pool slots
(one per frame index) permanently. After ~1024 animated instances created
total, vkAllocateDescriptorSets began failing silently and returning
VK_NULL_HANDLE. render() skips instances with null boneSet[frameIndex],
making them invisible — appearing as per-frame flicker as the culling pass
included them but the render pass excluded them.
Fix: destroyInstanceBones() now calls vkFreeDescriptorSets() for each
non-null boneSet before destroying the bone SSBO. The pool already had
VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT set for this purpose.
Also increased MAX_BONE_SETS from 2048 to 8192 for extra headroom.
Root cause: bonesDirty was a single bool shared across both
double-buffered frame indices. When bones were copied to frame 0's
SSBO and bonesDirty cleared, frame 1's newly-allocated SSBO would
contain garbage/zeros and never get populated — causing animated
M2 instances to flash invisible on alternating frames.
Fix: Make bonesDirty per-frame-index (bool[2]) so each buffer
independently tracks whether it needs bone data uploaded. When
bones are recomputed, both indices are marked dirty. When uploaded
during render, only the current frame index is cleared. New buffer
allocations in prepareRender force their frame index dirty.
- Overlap M2 and character animation updates via std::async (~2-5ms saved)
- Thread-local collision scratch buffers for concurrent floor queries
- Parallel terrain/WMO/M2 floor queries in camera controller
- Seed new M2 instance bones from existing siblings to eliminate pop-in flash
- Fix shadow flicker: snap center along stable light axes instead of in view space
- Increase shadow distance default to 300 units (slider max 500)
Move CPU-heavy BLP texture decoding from main thread to background worker
threads for all hot paths: terrain M2 models, WMO doodad M2s, WMO textures,
creature models, and gameobject WMOs. Each renderer (M2, WMO, Character) now
accepts a pre-decoded BLP cache that loadTexture() checks before falling back
to synchronous decode.
Defer WMO normal/height map generation (3 per-pixel passes: luminance, box
blur, Sobel) during terrain streaming finalization — this was the dominant
remaining bottleneck after BLP pre-decoding.
Terrain streaming stalls: 1576ms → 124ms worst case.
Terrain finalization was uploading all 256 chunks (GPU fence waits) in one
atomic advanceFinalization call that couldn't be interrupted by the 5ms time
budget. Now split into incremental batches of 16 chunks per call, allowing
the time budget to yield between batches.
M2 instance creation had O(N) dedup scans iterating ALL instances to check
for duplicates. In cities with 5000+ doodads, this caused O(N²) total work
during tile loading. Replaced with hash-based DedupKey map for O(1) lookups.
Changes:
- TerrainRenderer::loadTerrainIncremental: uploads N chunks per call
- FinalizingTile tracks terrainChunkNext for cross-frame progress
- TERRAIN phase yields after preload and after each chunk batch
- M2Renderer::DedupKey hash map replaces linear scan in createInstance
and createInstanceWithMatrix
- Dedup map maintained through rebuildSpatialIndex and clear paths
- Add magma/slime rendering path to water shader (fbm noise, crust/molten/core coloring)
- Fix WMO liquid height filter rejecting high-altitude zones like Ironforge (Z>300)
- Allow interior WMO magma/slime MLIQ groups to load (skip only water/ocean)
- Mark LAVASTEAM.m2 as spell effect for proper additive blend, hide emission mesh
- Add isLavaModel flag for M2 ForgeLava/LavaPots UV scroll fallback
- Add isLava material detection in WMO renderer for lava texture UV animation
- Fix WMO material UBO colors for magma (was blue, now orange-red)
WMO interior doodads (gears, decorations) were blocking player movement
via M2 collision. Skip collision for all WMO doodad M2 instances since
the WMO itself handles wall collision.
Also filter WMO wall collision using MOPY per-triangle flags: only
rendered+collidable triangles block the player, skipping invisible
collision hulls.
Revert tram portal extended range (no longer needed with collision fix).
Use cached model flags (isValid, isSmoke, isInvisibleTrap, isGroundDetail,
disableAnimation, boundRadius) on M2Instance instead of models.find() in
the hot culling paths. Also complete cached flag initialization in
createInstanceWithMatrix().
- Glow sprites now use dedicated vertex buffer (glowVB_) separate from
M2 particle buffer to prevent data race when renderM2Particles()
overwrites glow data mid-flight
- Move fadeAlpha from shared material UBO to per-draw push constants,
eliminating cross-instance alpha race on non-double-buffered UBOs
- Smooth adaptive render distance transitions to prevent pop-in/out
at instance count thresholds (1000/2000)
- Distance-tiered character bone throttling: near (<30u) every frame,
mid (30-60u) every 3rd, far (60-120u) every 6th frame
- Skip weapon instance animation updates (transforms set by parent bones)
- Split M2 instances into fast-path index lists (animated, particle-only,
particle-all, smoke) to avoid iterating all 46K instances per frame
- Cache model flags (hasAnimation, disableAnimation, isSmoke, etc.) on
M2Instance struct to eliminate per-frame hash lookups
- Replace full rebuildSpatialIndex on position/transform updates with
incremental grid cell remove+add, preventing 8.5ms/frame rebuild cost
- Advance animTime for all instances (texture UV animation) but only
compute bones and particles for the ~3K that need it
M2_UPDATE: 10.7ms → 2.0ms, FPS: 35 → 55-59
Reset descriptor pools in CharacterRenderer/M2Renderer/WMORenderer on map
change to prevent VK_ERROR_DEVICE_LOST from pool exhaustion. Defer re-entrant
SMSG_NEW_WORLD during active world load to avoid recursive cleanup crashes.
Gate swim bubbles on swimming state, skip redundant shadow pipeline re-init,
add WOWEE_SKIP_* env vars for render isolation debugging.
StormLib's bundled libtomcrypt uses x86 inline assembly (bswapl/movl)
gated by __MINGW32__, which is defined on CLANGARM64 too. Pass
-DLTC_NO_BSWAP to force portable C byte-swap fallback.
Replace monolithic finalizeTile() with a phased state machine that spreads
GPU upload work across multiple frames (TERRAIN→M2→WMO→WATER→AMBIENT→DONE).
Each advanceFinalization() call does one bounded unit of work within the
per-frame time budget, eliminating 50-300ms frame hitches when entering cities.
Additional performance improvements:
- Pre-allocate bone SSBOs at M2 instance creation instead of lazily during
first render frame, preventing hitches when many skinned characters appear
- Enable WMO distance culling (800 units) with active-group exemption so
the player's current floor/neighbors are never culled
- Add 4-tier adaptive M2 render distance (250/400/600/1000 based on count)
- Remove dead PendingM2Upload queue code superseded by incremental system
Fix tile re-enqueueing bug: keep tiles in pendingTiles until committed to
loadedTiles (not when moved to finalizingTiles_) so streamTiles() doesn't
re-enqueue tiles mid-finalization. Also handle unloadTile() for tiles in
the finalizingTiles_ deque to prevent orphaned water/M2/WMO resources.
- Spawn dark point-sprite insects buzzing around cattails/reeds/kelp/seaweed
- Fix firefly M2 particles: exempt from alpha dampening and forced gravity
- Make water shoreline/crest foam more irregular with UV warping and bluer tint
- Cache material UBO mapped pointers at creation time, eliminating
per-batch vmaGetAllocationInfo() calls in the hot render path
- Precompute foliage/elven/lantern/kobold model name classifications
at load time instead of per-instance string operations every frame
- Remove redundant descriptor set and push constant rebinds on WMO
pipeline switches (preserved across compatible layouts)
- Pre-allocate glow sprite descriptor set once at init instead of
allocating from the pool every frame
Shadow casting: foliage batches now bind their actual texture in the shadow
pass with alpha testing, producing leaf-shaped shadows instead of solid cards.
Uses a per-frame resettable descriptor pool for texture sets.
Shadow receiving: foliage fragments now sample the shadow map with PCF
instead of using a flat constant darkening.
Gameobject M2 instances (books, crates, chests) were continuously
cycling their animations because M2Renderer unconditionally loops
all sequences. Added setInstanceAnimationFrozen() and freeze all
gameobject instances at creation time so they stay in their bind pose.
- keep shadow projection center fixed while moving to remove per-frame projection churn flicker
- replace delayed post-move catch-up with immediate stop transition and idle smoothing
- rework foliage shadow caster motion to use blended phase-shifted UV samples for continuous position transitions
- reduce high-frequency foliage threshold popping by removing threshold warping path
- sharpen terrain receive filtering with tuned 5-tap PCF weights/offset for more detailed shadows
- raise shadow map resolution to 1536 and keep light-space texel snapping for stable sampling
- set shadows enabled by default and lower global shadow strength from 0.65 to 0.62
- keep foliage animation speed consistent between moving and idle at 80%
- reduce per-tile ground clutter generation pressure and enforce tighter caps to avoid spikes
- remove expensive detail dedupe scans from the hot render path
- add progressive/lazy clutter updates around player movement to smooth frame pacing
- lower noisy runtime INFO logging to DEBUG/throttled paths
- keep terrain/game screen updates responsive while preserving existing behavior
Replace 2D screen-space ding rings with real WoW LevelUp.m2 particle/geometry
effect. Fix FBlock particle color parsing (C3Vector floats, not CImVector bytes)
which was producing blue/red instead of golden yellow. Spell effect models bypass
particle dampeners, glow sprite conversion, Mod→Additive blend override, and all
collision (floor/wall/camera) to prevent camera zoom-in. Other players' level-ups
trigger the 3D effect at their position with group chat notification. F7 hotkey
for testing.
Both passes were rendering the entire loaded scene (17×17 tile radius)
into a shadow map that only covers 360×360 world units — submitting
10-50× more geometry than the shadow frustum can actually use.
- TerrainRenderer::renderShadow: skip chunks whose bounding sphere
doesn't overlap the shadow frustum AABB in XY. Reduces terrain draw
calls from O(all loaded chunks) to O(chunks within ~180 units).
- M2Renderer::renderShadow: skip instances whose world AABB doesn't
overlap the shadow frustum in XY. Reduces M2 draw calls similarly.
- Both functions now take shadowCenter + halfExtent parameters.
Batches whose named texture fails to load now render invisible instead of
white (the swampreeds01a.blp case causing a white shell around aquatic plants).
Also implements proper M2 opacity plumbing:
- Parse texture weight tracks (M2Track<fixed16>) and color animation alpha
tracks (M2Color.alpha) to resolve per-batch opacity at load time
- Skip batches with batchOpacity < 0.01 in the render loop
- Apply M2Texture.flags (bit0=WrapS, bit1=WrapT) to GL sampler wrap mode
- Upload both UV sets (texCoords[0] and texCoords[1]) and select via
textureUnit uniform, so batches referencing UV set 1 render correctly
Added slope normal checking to reject surfaces too steep to walk.
Prevents character/mount from clipping through steep terrain.
Changes:
- Added MIN_WALKABLE_NORMAL threshold (0.7 = ~45° max slope)
- WMO collision: query surface normal, reject if normalZ < 0.7
- M2 collision: query surface normal, reject if normalZ < 0.7
- Updated M2Renderer::getFloorHeight to output surface normal
- M2 already had internal 0.35 check (~70°), new 0.7 is more restrictive
Steep slopes now block movement instead of allowing clipping.
Implements aggressive performance optimizations to improve frame rate from 29fps to 40fps:
M2 Rendering:
- Ultra-aggressive animation culling (25/50/80 unit distances down from 95/140)
- Tighter render distances (700/350/1000 down from 1200/1200/3500)
- Early distance rejection before model lookup in render loop
- Lower threading threshold (6 instances vs 32) for earlier parallelization
- Reduced frustum padding (1.5x vs 2.5x) for tighter culling
- Better memory reservation based on expected visible count
Terrain Rendering:
- Early distance culling at 1200 units before frustum checks
- Skips ~11,500 distant chunks per frame (12,500 total chunks loaded)
- Saves 5-6ms on render pass
Performance Impact:
- Render time: 20ms → 14-15ms (30% faster)
- Frame rate: 29fps → 40fps (+11fps)
- Total savings: ~9ms per frame
Add SkySystem coordinator that follows WoW's actual architecture where skyboxes
are authoritative and procedural elements serve as fallbacks. Integrate lighting
system across all renderers (terrain, WMO, M2, character) with unified parameters.
Sky System:
- SkySystem coordinator manages skybox, celestial bodies, stars, clouds, lens flare
- Skybox is authoritative (baked stars from M2 models, procedural fallback only)
- skyboxHasStars flag gates procedural star rendering (prevents double-star bug)
Celestial Bodies (Lore-Accurate):
- Two moons: White Lady (30-day cycle, pale white) + Blue Child (27-day cycle, pale blue)
- Deterministic moon phases from server gameTime (not deltaTime toys)
- Sun positioning driven by LightingManager directionalDir (DBC-sourced)
- Camera-locked sky dome (translation ignored, rotation applied)
Lighting Integration:
- Apply LightingManager params to WMO, M2, character renderers
- Unified lighting: directional light, diffuse color, ambient color, fog
- Star occlusion by cloud density (70% weight) and fog density (30% weight)
Documentation:
- Add comprehensive SKY_SYSTEM.md technical guide
- Update MEMORY.md with sky system architecture and anti-patterns
- Update README.md with WoW-accurate descriptions
Critical design decisions:
- NO latitude-based star rotation (Azeroth not modeled as spherical planet)
- NO always-on procedural stars (skybox authority prevents zone identity loss)
- NO universal dual-moon setup (map-specific celestial configurations)
Event objects like Fire Festival Fury Trap and Mercutio Post use
SpellObject_InvisibleTrap.m2 models which were rendering as white
tiles using WHITE1.BLP texture. These are meant to be invisible
spell trigger objects that should not obstruct player movement.
Changes:
- Added isInvisibleTrap flag to M2ModelGPU struct
- Detect models with "invisibletrap" in name during loading
- Skip rendering invisible trap instances in render loop
- Disable all collision checks (floor/wall/occlusion) for invisible traps
- Objects remain functional for spell casting but are now invisible
Major improvements:
- Load TaxiPathNode.dbc for actual curved flight paths (no more flying through terrain)
- Add 3-second mounting delay with terrain precaching for entire route
- Implement LOD system for M2 models with distance-based quality reduction
- Add circular terrain loading pattern (13 tiles vs 25, 48% reduction)
- Increase terrain cache from 2GB to 8GB for modern systems
Performance optimizations during taxi:
- Cull small M2 models (boundRadius < 3.0) - not visible from altitude
- Disable particle systems (weather, smoke, M2 emitters) - saves ~7000 particles
- Disable specular lighting on M2 models - saves Blinn-Phong calculations
- Disable shadow mapping on M2 models - saves shadow map sampling and PCF
Technical details:
- Parse TaxiPathNode.dbc spline waypoints for curved paths around terrain
- Build full path from node pairs using TaxiPathEdge lookup
- Precache callback triggers during mounting delay for smooth takeoff
- Circular tile loading uses Euclidean distance check (dx²+dy² <= r²)
- LOD fallback to base mesh when higher LODs unavailable
Result: Buttery smooth taxi flights with no terrain clipping or performance hitches
Parse bounding vertices, triangles, and normals from M2 files and use
them for proper triangle-level collision instead of AABB heuristics.
Spatial grid bucketing for efficient queries, closest-point wall push
with soft clamping, and ray-triangle floor detection alongside existing
AABB fallback.