- Mark swapchain dirty in Application's SDL resize handler (was only done
in Window::pollEvents which is never called)
- Skip swapchain recreation when window is minimized (0×0 extent violates
Vulkan spec and crashes vmaCreateImage)
- Guard aspect ratio division by zero when height is 0
Signed-off-by: Pavel Okhlopkov <pavel.okhlopkov@flant.com>
Add complete spell visual pipeline resolving the DBC chain
(Spell → SpellVisual → SpellVisualKit → SpellVisualEffectName → M2)
with precast/cast/impact phases, bone-attached positioning, and
automatic dual-hand mirroring.
Ribbon rendering fixes:
- Parse visibility track as uint8 (was read as float, suppressing
all ribbon edges due to ~1.4e-45 failing the >0.5 check)
- Filter garbage emitters with bone=UINT_MAX unconditionally
- Guard against NaN spine positions from corrupt bone data
- Resolve ribbon textures via direct index, not textureLookup table
- Fall back to bone 0 when ribbon bone index is out of range
Particle rendering fixes:
- Reduce spell particle scale from 5x to 1.5x (was oversized)
- Exempt spell effect instances from position-based deduplication
Spell handler integration:
- Trigger precast visuals on SMSG_SPELL_START with server castTimeMs
- Trigger cast/impact visuals on SMSG_SPELL_GO
- Cancel precast visuals on cast interrupt/failure/movement
M2 classifier expansion:
- Add AmbientEmitterType enum for sound system integration
- Add 20+ foliage tokens, 4 spell effect tokens, isSmallFoliage flag
- Add markModelAsSpellEffect() to override disableAnimation
DBC layouts:
- Add SpellVisualID field to Spell.dbc for all expansion configs
Signed-off-by: Pavel Okhlopkov <pavel.okhlopkov@flant.com>
HiZ occlusion culling built ~11 mip levels with per-level barriers
behind a blocking vkWaitForFences every frame — the main frame-rate
bottleneck. Disable the pyramid build and fall back to GPU frustum-
only culling which is nearly free behind the fence.
WMO interiors now receive full-strength directional shadows but clamp
minimum brightness at 0.45 with a 0.35 ambient floor, so interiors
get real shadow contrast without going too dark.
All M2 pipelines used VK_CULL_MODE_NONE, so back-facing polygons always
rendered. On NPCs whose torso meshes are single-layer geometry this
made the interior cavity visible through the back.
Create backface-culled pipeline variants (VK_CULL_MODE_BACK_BIT) and
select them at draw time unless the material has the TwoSided flag
(0x04). Foliage/ground-detail forceCutout batches and the shadow
pipeline keep VK_CULL_MODE_NONE since those cards are inherently
two-sided.
M2Renderer::clear() reset descriptor pools but left the texture cache
and failed-texture tracking intact. On re-login the stale cache filled
the budget, failedTextureRetryAt_ blocked reloads, and the client
entered an infinite model-load loop. Match the cleanup already done in
shutdown() and CharacterRenderer::clear().
Implement GPU-driven Hierarchical-Z occlusion culling for M2 doodads
using a depth pyramid built from the previous frame's depth buffer.
The cull shader projects bounding spheres via prevViewProj (temporal
reprojection) and samples the HiZ pyramid to reject hidden objects
before the main render pass.
Key implementation details:
- Separate early compute submission (beginSingleTimeCommands + fence
wait) eliminates 2-frame visibility staleness
- Conservative safeguards prevent false culls: screen-edge guard,
full VP row-vector AABB projection (Cauchy-Schwarz), 50% sphere
inflation, depth bias, mip+1, min screen size threshold, camera
motion dampening (auto-disable on fast rotations), and per-instance
previouslyVisible flag tracking
- Graceful fallback to frustum-only culling if HiZ init fails
Fix dark WMO interiors by gating shadow map sampling on isInterior==0
in the WMO fragment shader. Interior groups (flag 0x2000) now rely
solely on pre-baked MOCV vertex-color lighting + MOHD ambient color.
Disable interiorDarken globally (was incorrectly darkening outdoor M2s
when camera was inside a WMO). Use isInsideInteriorWMO() instead of
isInsideWMO() for correct indoor detection.
New files:
- hiz_system.hpp/cpp: pyramid image management, compute pipeline,
descriptors, mip-chain build dispatch, resize handling
- hiz_build.comp.glsl: MAX-depth 2x2 reduction compute shader
- m2_cull_hiz.comp.glsl: frustum + HiZ occlusion cull compute shader
- test_indoor_shadows.cpp: 14 unit tests for shadow/interior contracts
Modified:
- CullUniformsGPU expanded 128->272 bytes (HiZ params, viewProj,
prevViewProj)
- Depth buffer images gain VK_IMAGE_USAGE_SAMPLED_BIT for HiZ reads
- wmo.frag.glsl: interior branch before unlit, shadow skip for 0x2000
- Render graph: hiz_build + compute_cull disabled (run in early compute)
- .gitignore: ignore compiled .spv binaries
- MEGA_BONE_MAX_INSTANCES: 2048 -> 4096
Signed-off-by: Pavel Okhlopkov <pavel.okhlopkov@flant.com>
Demote 44 more LOG_WARNING messages to LOG_DEBUG: warden module chunk
progress, entire shutdown/teardown sequence, transport manager connect,
inventory right-click slot, and warden handshake diagnostics. Keeps real
warnings (texture not found, slow handlers, stalls, integrity hash)
visible in the log.
When spline parsing consumes the wrong number of bytes, the subsequent
blockCount read lands on garbage data (e.g. 71 instead of ~5 for UNIT).
Previously the parser logged a warning but continued, reading garbage
mask/field data until hitting truncation. Now it returns false for
CREATE_OBJECT blocks with suspicious counts, letting the block loop
skip cleanly to the next entity.
Also downgrade ~44 diagnostic LOG_WARNING messages to LOG_DEBUG across
17 files (equipment, transport, DBC, heartbeat, chat, GO raypick, etc.)
to reduce log noise and make real warnings visible.
Resolve conflicts:
- audio_callback_handler.cpp: keep PR's animation_controller include
- movement_handler.cpp: use PR accessors with master's transportResolved logic
- world_packets.cpp: keep PR's decomposed version (functions moved to split files)
Apply overkill field fix to world_packets_entity.cpp (WotLK
SMSG_ATTACKERSTATEUPDATE missing uint32 overkill between damage and
subDamageCount).
If the capability probe ran before the model was fully loaded, all melee
animation IDs would be 0 and auto-attack swings would silently fall back
to STAND (no visible animation). Now re-probes when a melee swing fires
but hasMelee is false.
Added WARNING-level logging to triggerMeleeSwing and CombatFSM to
diagnose the night elf stationary combat animation issue.
Emote animations: fix DBC chain for /laugh, /flirt, /sleep, /fart, /stink.
Previously all emotes with AnimID=0 used emoteRef as animId (wrong DBC
record IDs). Now resolves through Emotes.dbc properly, with per-emote
overrides for emotes whose DBC chain yields 0. Adds Emotes.dbc load
failure warning and diagnostic logging.
WMO culling: skip portal culling when camera is outside all groups (fixes
vanishing Stormwind ground tiles). Also handle indoor/outdoor AABB overlap
by showing all groups when position is in both indoor and outdoor AABBs.
Transport: clear ONTRANSPORT flag and transport state when transport not
found, preventing stale transport data from teleporting player to map
origin. Add area trigger safety net near (0,0,0) on Eastern Kingdoms.
MAX_CULL_INSTANCES was 16384 but game object instances were allocated
at indices 20000+, beyond the cull buffer. Increased to 65536.
Also compute fallback boundRadius from vertex extents when M2 header
reports 0, and seed bones for global-sequence-only animated models.
M2 GPU instancing
- M2InstanceGPU SSBO (96 B/entry, double-buffered, 16384 max)
- Group opaque instances by (modelId, LOD); single vkCmdDrawIndexed per group
- boneBase field indexes into mega bone SSBO via gl_InstanceIndex
Indirect terrain drawing
- 24 MB mega index buffer (6M uint32) + 64 MB mega vertex buffer
- CPU builds VkDrawIndexedIndirectCommand per visible chunk
- Single VB/IB bind per frame; shadow pass reuses mega buffers
- Replaced vkCmdDrawIndexedIndirect with direct vkCmdDrawIndexed to fix
host-mapped buffer race condition that caused terrain flickering
GPU frustum culling (compute shader)
- m2_cull.comp.glsl: 64-thread workgroups, sphere-vs-6-planes + distance cull
- CullInstanceGPU SSBO input, uint visibility[] output, double-buffered
- dispatchCullCompute() runs before main pass via render graph node
Consolidated bone matrix SSBOs
- 16 MB double-buffered mega bone SSBO (2048 instances × 128 bones)
- Eliminated per-instance descriptor sets; one megaBoneSet_ per frame
- prepareRender() packs bone matrices consecutively into current frame slot
Render graph / frame graph
- RenderGraph: RGResource handles, RGPass nodes, Kahn topological sort
- Automatic VkImageMemoryBarrier/VkBufferMemoryBarrier between passes
- Passes: minimap_composite, worldmap_composite, preview_composite,
shadow_pass, reflection_pass, compute_cull
- beginFrame() uses buildFrameGraph() + renderGraph_->execute(cmd)
Pipeline derivatives
- PipelineBuilder::setFlags/setBasePipeline for VK_PIPELINE_CREATE_DERIVATIVE_BIT
- M2 opaque = base; alphaTest/alpha/additive are derivatives
- Applied to terrain (wireframe) and WMO (alpha-test) renderers
Rendering bug fixes:
- fix(shadow): compute lightSpaceMatrix before updatePerFrameUBO to eliminate
one-frame lag that caused shadow trails and flicker on moving objects
- fix(shadow): scale depth bias with shadowDistance_ instead of hardcoded 0.8f
to prevent acne at close range and gaps at far range
- fix(visibility): WMO group distance threshold 500u → 1200u to match terrain
view distance; buildings were disappearing on the horizon
- fix(precision): camera near plane 0.05 → 0.5 (ratio 600K:1 → 60K:1),
eliminating Z-fighting and improving frustum plane extraction stability
- fix(streaming): terrain load radius 4 → 6 tiles (~2133u → ~3200u) to exceed
M2 render distance (2800u) and eliminate pop-in when camera turns;
unload radius 7 → 9; spawn radius 3 → 4
- fix(visibility): ground-detail M2 distance multiplier 0.75 → 0.9 to reduce
early pop of grass and debris
- Enable alpha-to-coverage on alphaTestPipeline for smooth hair edges
when MSAA is active (both init and recreatePipelines paths)
- Shader uses fwidth()-based alpha rescaling for clean coverage
- Skip group 17/18 geosets (DK/NE eye glow) when no geoset filter is
set — prevents blue eye glow on all NPCs
- Add missing <libgen.h> include for dirname() on Linux
- alphaTestPipeline_ uses blendDisabled() so surviving pixels are fully
opaque (was blendAlpha, causing hair to blend with background)
- Remove alphaCutout from alphaTest condition — opaque materials like
capes no longer alpha-test just because their texture has an alpha
channel
- Two-pass batch rendering: opaque (blendMode 0) draws first to
establish depth, then alpha-key/blend draws on top
Bone SSBO buffers were allocated for MAX_BONES (240) entries but only
the first numBones were written. Uninitialized GPU memory in the
remaining slots caused vertex spikes when any bone index exceeded the
model's actual bone count.
Also send MSG_MOVE_STOP before spell casts so the server doesn't reject
cast-time spells (e.g. hearthstone) with "can't do that while moving".
Global sequence bones (hair, cape, physics) need time values spanning
their full duration (up to ~968733ms), but animationTime wraps at the
current animation's sequence duration (~2000ms for walk). This caused
vertex spikes projecting from fingers/neck/ponytail as bones got stuck
in the first ~2s of their loop. Add a separate globalSequenceTime
accumulator that is not wrapped at the animation duration.
Hair, cape, and other physics bones use global sequences (continuously
looping timers independent of the character's current animation). The
character renderer was ignoring globalSequence entirely, causing these
bones to fall back to identity transforms and produce deformed/spiked
hair geometry. Added resolveTrackTime() to wrap global sequence time
correctly, matching the M2 renderer's existing behavior.
M2 vertex bone indices are indices into boneLookupTable, not direct bone
array indices. Without remapping, vertices weighted to higher bone
indices (cloak, cape, hair) get the wrong bone transform, causing
vertices to project wildly outward from the character.
Wait on in-flight upload batch fences at the start of each frame and
insert a memory barrier (transfer→fragment shader) so the graphics
queue sees completed layout transitions from the transfer queue.
Fixes VK_IMAGE_LAYOUT_UNDEFINED validation errors for freshly loaded
textures.
Revert static JSON layout changes (15-22 back to 14-21) since WotLK
loads the Classic 23-field DBC. Add getItemDisplayInfoTextureFields()
helper that detects field count at runtime and adjusts the texture
base index accordingly (14 for 23-field, 15 for 25-field).
When saved settings loaded MSAA before FSR2 on startup, the pending MSAA
change (e.g. 8x) was queued before FSR2 was enabled. Since FSR2 checked
the current MSAA (still 1x), it didn't override the pending change.
On the next frame, applyMsaaChange created a 4-attachment MSAA render pass,
then FSR2 lazily created a 2-attachment framebuffer against it — SIGSEGV.
Add guards in both applyMsaaChange (force 1x if FSR2 is blocking) and
setFSR2Enabled (override any pending MSAA >1x when enabling FSR2).
AMD RADV validation flagged missing shaderStorageImageWriteWithoutFormat,
shaderInt16, shaderFloat16, and deviceCoherentMemory. The first two are
now required device features; shaderFloat16 is optionally enabled via
Vulkan 1.2 feature query; AMD device coherent memory extension and
feature are enabled when available to prevent VMA memory type errors.
executePostProcessing() only set inlineMode=true in the FSR2 path.
The FXAA and FSR1 paths both start INLINE render passes but returned
false, causing endFrame() to record ImGui into a secondary command
buffer and execute it inside the INLINE pass — validation errors and
UI disappearing on AMD RADV when FSR1+MSAA are both enabled.
Instance was created with Vulkan 1.1 but depthResolveSupported_ was gated
on the physical device's API version (1.2+ on RADV). This caused
vkCreateRenderPass2 (core 1.2) to dispatch through a null function pointer
when MSAA was enabled. Now requests 1.2 instance with 1.1 minimum fallback
and gates depth resolve on the actual instance API version. Also removes
all diagnostic crash-phase instrumentation from the previous investigation.
AMD crash is caused by msaaChangePending_ flipping true from saved settings.
applyMsaaChange() then crashes with faultAddr=(nil). Add LOG_WARNING markers
between pipeline recreation groups to identify the failing call.
Add per-frame LOG_WARNING with this/vkCtx/camera/postProcess pointers and
msaaChangePending state. If the log prints before crash, the pointer values
tell us what's corrupt. If it doesn't print, crash is in the log itself
(meaning this or vkCtx is corrupt).
Previous markers showed crash before bf:ubo. Add markers at bf:msaa,
bf:pp, bf:swap, bf:acquire, bf:jitter to isolate which early beginFrame
call does the NULL deref.
AMD RADV crash is renderPhase=beginFrame with faultAddr=(nil) — a NULL
pointer dereference somewhere in the pre-pass chain. Add granular markers
(bf:ubo, bf:minimap, bf:worldmap, bf:preview, bf:shadow, bf:reflection,
bf:renderpass) to pinpoint the exact call.
Add signal-safe render-phase markers throughout GameScreen::render() and
Application::render() so the crash handler can report which render call was
active when a SIGSEGV occurs. The AMD RADV crash backtrace only shows 2
frames due to missing frame pointers, making it impossible to identify the
actual crash site.
Changes:
- Add volatile g_crashRenderPhase marker updated before each major render call
- Upgrade Linux signal handler to sigaction with SA_SIGINFO for faulting address
- Set ImGui CheckVkResultFn to log silent Vulkan errors in ImGui backend
- Enable -fno-omit-frame-pointer in all build configs (not just Debug/RelWithDebInfo)
destroyInstanceBones loops over both frame slots (i=0,1), deferring
both boneSet[0] and boneSet[1] to the current frame's fence. When
currentFrame=0, boneSet[1] is freed after only slot 0's fence completes
while slot 1's command buffer may still be using it.
Switch both M2Renderer and CharacterRenderer bone destruction from
deferAfterFrameFence to deferAfterAllFrameFences to ensure all
in-flight frames have completed before freeing cross-slot resources.
deferAfterFrameFence only waits for one frame slot's fence, but shared
resources (material descriptor sets, vertex/index buffers) are bound by
both in-flight frames' command buffers. On AMD RADV this caused
vkFreeDescriptorSets errors and eventual SIGSEGV.
Add deferAfterAllFrameFences: queues to every frame slot with a shared
counter so cleanup runs exactly once, after the last slot is fenced.
Use it for WMO, terrain, water, and character model shared resources.
Per-frame bone sets keep using deferAfterFrameFence (already correct).
Also fix character renderer vertex format: R8G8B8A8_UINT -> _SINT to
match shader's ivec4 input (RADV validation rejects the mismatch).
CharacterRenderer::destroyModelGPU now defers vertex/index buffer
destruction when replacing models mid-stream, preventing use-after-free
on AMD RADV. FXAA descriptor sets are now per-frame to eliminate
write-read races between in-flight command buffers. Water reflection
descriptor update narrowed to current frame only.
CharacterRenderer::destroyInstanceBones had the same immediate-free bug
as M2Renderer — freeing bone descriptor sets and buffers while in-flight
command buffers still reference them. Applies the same deferred pattern
via deferAfterFrameFence for the removeInstance streaming path.
M2 destroyInstanceBones and WMO destroyGroupGPU freed descriptor sets
and buffers immediately during tile streaming, while in-flight command
buffers still referenced them — causing DEVICE_LOST on AMD RADV.
Now defers GPU resource destruction via deferAfterFrameFence in streaming
paths (removeInstance, removeInstances, unloadModel). Immediate
destruction preserved for shutdown/clear paths that vkDeviceWaitIdle
first.
Also: vkDeviceWaitIdle before WMO backfillNormalMaps descriptor rebinds,
and fillModeNonSolid added to required device features for wireframe
pipelines on AMD.