Commit graph

182 commits

Author SHA1 Message Date
Kelsi
1c7b87ee78 Remove FSR3 wrapper path and keep official Path-A runtime only 2026-03-09 04:33:05 -07:00
Kelsi
9ff9f2f1f1 Fix cross-platform FSR3 compile path and Path-A runtime wiring 2026-03-09 04:24:24 -07:00
Kelsi
78fa10c6ba Gate bridge interop export by wrapper capability bits
2026-03-09 02:45:37 -07:00
Kelsi
27261303d2 Only export bridge interop handles for dx12_bridge backend 2026-03-09 02:35:02 -07:00
Kelsi
faec3f6ec2 Enable Linux bridge mode and Vulkan FD interop exports 2026-03-09 02:28:49 -07:00
Kelsi
45c2ed7a64 Expose wrapper backend mode in runtime diagnostics 2026-03-09 02:15:55 -07:00
Kelsi
1bcc2c6b85 Use monotonic interop fence values for FSR3 bridge dispatch 2026-03-09 02:03:03 -07:00
Kelsi
1c7908f02d Add ABI v3 fence-value sync for DX12 bridge dispatch 2026-03-09 01:58:45 -07:00
Kelsi
94a441e234 Advance DX12 bridge dispatch path and guard Win32 Vulkan handle exports 2026-03-09 01:49:18 -07:00
Kelsi
d06d1df873 Wire Win32 semaphore handles into FSR3 wrapper bridge payload 2026-03-09 01:33:39 -07:00
Kelsi
61cb2df400 Add ABI v2 external-handle plumbing for FSR3 bridge dispatch 2026-03-09 01:31:01 -07:00
Kelsi
f08b6fd4c2 Expose FSR3 dispatch failure reasons in runtime and HUD 2026-03-09 01:15:49 -07:00
Kelsi
93850ac6dc Add Path A/B/C FSR3 runtime detection with clear FG fallback status 2026-03-09 00:01:45 -07:00
Kelsi
5ad4b9be2d Add FSR3 frame generation runtime stats to performance HUD 2026-03-08 23:35:39 -07:00
Kelsi
e600f003ca Fix FG disable state reset and FSR3 context compatibility for SDK variants 2026-03-08 23:32:02 -07:00
Kelsi
aa43aa6fc8 Bridge FSR3 Vulkan framegen dispatch and route sharpen to interpolated output 2026-03-08 23:20:50 -07:00
Kelsi
538a1db866 Fix FSR3 runtime wrapper for local SDK API and real Vulkan resource dispatch 2026-03-08 23:13:08 -07:00
Kelsi
f1099f5940 Add FSR3 runtime library probing and readiness status 2026-03-08 23:03:45 -07:00
Kelsi
e1c93c47be Stage FSR3 framegen dispatch hook in frame pipeline 2026-03-08 22:57:35 -07:00
Kelsi
bdfec103ac Add persisted AMD FSR3 framegen runtime toggle plumbing 2026-03-08 22:53:21 -07:00
Kelsi
f6fce0f19a Fix FSR2 jitter mismatch between projection and dispatch 2026-03-08 21:10:36 -07:00
Kelsi
2e71c768db Add live FSR2 motion/jitter tuning controls and HUD readout 2026-03-08 20:56:22 -07:00
Kelsi
96f7728227 Add AMD FSR2 CI build job and adjust jitter offset sign 2026-03-08 20:27:39 -07:00
Kelsi
a12126cc7e Persist upscaling mode and refine FSR2 jitter behavior 2026-03-08 20:22:11 -07:00
Kelsi
6fd1c94d99 Tune AMD FSR2 motion vectors to reduce residual jitter 2026-03-08 20:15:54 -07:00
Kelsi
0a88406a3d Reduce FSR2 jitter by preserving temporal history 2026-03-08 20:07:01 -07:00
Kelsi
51a8cf565f Integrate AMD FSR2 backend and document SDK bootstrap 2026-03-08 19:56:52 -07:00
Kelsi
a24ff375fb Add AMD FSR2 SDK detection and backend integration scaffolding 2026-03-08 19:33:07 -07:00
Kelsi
e2a2316038 Stabilize FSR2 path and refine temporal pipeline groundwork 2026-03-08 18:52:04 -07:00
Kelsi
c3047c33ba FSR2: fix motion vector jitter, add bicubic anti-ringing, depth-dilated MVs
- Motion shader: unjitter NDC before reprojection (ndc+jitter, not ndc-jitter),
  compute motion against unjittered UV so static scenes produce zero motion
- Pass jitter offset to motion shader (push constant 80→96 bytes)
- Accumulate shader: restore Catmull-Rom bicubic with anti-ringing clamp to
  prevent negative-lobe halos at edges while maintaining sharpness
- Add depth-dilated motion vectors (3x3 nearest-to-camera) to prevent
  background MVs bleeding over foreground edges
- Widen neighborhood clamp gamma to 3.0, uniform 5% blend with
  disocclusion/velocity reactive boosting
2026-03-08 14:18:00 -07:00
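The depth-dilated motion vector step in the commit above (3x3 nearest-to-camera) can be sketched on the CPU as follows. This is a minimal illustration, not the shader from the commit: the names `Vec2` and `dilatedMotion` are hypothetical, and it assumes a standard depth buffer where a smaller value means closer to the camera (a reversed-Z setup would flip the comparison).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Hypothetical helper: return the motion vector of whichever pixel in the
// 3x3 neighborhood around (px, py) is nearest to the camera.
// Assumption: smaller depth value == nearer to the camera.
inline Vec2 dilatedMotion(const std::vector<float>& depth,
                          const std::vector<Vec2>& motion,
                          int width, int height, int px, int py) {
    auto indexOf = [width](int x, int y) {
        return static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
               static_cast<std::size_t>(x);
    };
    float nearest = depth[indexOf(px, py)];
    Vec2 best = motion[indexOf(px, py)];
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            // Clamp to the image so edge pixels reuse their border neighbors.
            const int x = std::clamp(px + dx, 0, width - 1);
            const int y = std::clamp(py + dy, 0, height - 1);
            const float d = depth[indexOf(x, y)];
            if (d < nearest) {
                nearest = d;
                best = motion[indexOf(x, y)];
            }
        }
    }
    return best;
}
```

Taking the foreground neighbor's vector is what keeps background motion from bleeding over foreground edges: a background pixel next to a moving foreground object inherits the foreground motion.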
Kelsi
e94eb7f2d1 FSR2 temporal upscaling fixes: unjittered reprojection, sharpen Y-flip, MSAA guard, descriptor double-buffering
- Motion vectors: single unjittered reprojection matrix (80 bytes) instead of
  two jittered matrices (160 bytes), eliminating numerical instability from
  jitter amplification through large world coordinates
- Sharpen pass: fix Y-flip for correct UV sampling, double-buffer descriptor
  sets to avoid race with in-flight command buffers
- MSAA: auto-disable when FSR2 enabled, grey out AA setting in UI
- Accumulation: variance-based neighborhood clamping in YCoCg space,
  correct history layout transitions
- Frame index: wrap at 256 for stable Halton sequence
2026-03-08 01:22:15 -08:00
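The variance-based neighborhood clamp in YCoCg space mentioned in the commit above can be sketched like this. It is an illustrative CPU version under stated assumptions, not the project's compute shader: `Color3`, `rgbToYCoCg`, `yCoCgToRgb`, and `varianceClamp` are hypothetical names, the neighborhood is assumed to be 3x3 and already converted to YCoCg, and `gamma` is a tunable width for the clamp box. Requires C++17 for `std::clamp`.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

struct Color3 { float a, b, c; };  // RGB or YCoCg, depending on context

// Standard YCoCg transform and its inverse.
inline Color3 rgbToYCoCg(const Color3& rgb) {
    return { 0.25f * rgb.a + 0.5f * rgb.b + 0.25f * rgb.c,
             0.5f * rgb.a - 0.5f * rgb.c,
            -0.25f * rgb.a + 0.5f * rgb.b - 0.25f * rgb.c };
}

inline Color3 yCoCgToRgb(const Color3& ycc) {
    return { ycc.a + ycc.b - ycc.c,
             ycc.a + ycc.c,
             ycc.a - ycc.b - ycc.c };
}

// Clamp one channel of the history sample to mean +/- gamma * sigma.
inline float clampToVariance(float history, float sum, float sumSq,
                             float count, float gamma) {
    const float mean = sum / count;
    const float variance = std::max(sumSq / count - mean * mean, 0.0f);
    const float sigma = std::sqrt(variance);
    return std::clamp(history, mean - gamma * sigma, mean + gamma * sigma);
}

// Hypothetical helper: clamp a history color against the statistics of the
// current frame's 3x3 neighborhood (all values in YCoCg).
inline Color3 varianceClamp(const std::array<Color3, 9>& neighborhood,
                            const Color3& history, float gamma) {
    Color3 sum{0.0f, 0.0f, 0.0f};
    Color3 sumSq{0.0f, 0.0f, 0.0f};
    for (const Color3& s : neighborhood) {
        sum.a += s.a;  sumSq.a += s.a * s.a;
        sum.b += s.b;  sumSq.b += s.b * s.b;
        sum.c += s.c;  sumSq.c += s.c * s.c;
    }
    const float n = static_cast<float>(neighborhood.size());
    return { clampToVariance(history.a, sum.a, sumSq.a, n, gamma),
             clampToVariance(history.b, sum.b, sumSq.b, n, gamma),
             clampToVariance(history.c, sum.c, sumSq.c, n, gamma) };
}
```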
Kelsi
52317d1edd Implement FSR 2.2 temporal upscaling
Full FSR 2.2 pipeline with depth-based motion vector reprojection,
temporal accumulation with YCoCg neighborhood clamping, and RCAS
contrast-adaptive sharpening.

Architecture (designed for FSR 3.x frame generation readiness):
- Camera: Halton(2,3) sub-pixel jitter with unjittered projection
  stored separately for motion vector computation
- Motion vectors: compute shader reconstructs world position from
  depth + inverse VP, reprojects with previous frame's VP
- Temporal accumulation: compute shader blends 5-10% current frame
  with 90-95% clamped history, adaptive blend for disocclusion
- History: ping-pong R16G16B16A16 buffers at display resolution
- Sharpening: RCAS fragment pass with contrast-adaptive weights

Integration:
- FSR2 replaces both FSR1 and MSAA when enabled
- Scene renders to internal resolution framebuffer (no MSAA)
- Compute passes run between scene and swapchain render passes
- Camera cut detection resets history on teleport
- Quality presets shared with FSR1 (0.50-0.77 scale factors)
- UI: "Upscaling" combo with Off/FSR 1.0/FSR 2.2 options
2026-03-07 23:13:01 -08:00
Kelsi
4cb03c38fe Parallel animation updates, thread-safe collision, M2 pop-in fix, shadow stabilization
- Overlap M2 and character animation updates via std::async (~2-5ms saved)
- Thread-local collision scratch buffers for concurrent floor queries
- Parallel terrain/WMO/M2 floor queries in camera controller
- Seed new M2 instance bones from existing siblings to eliminate pop-in flash
- Fix shadow flicker: snap center along stable light axes instead of in view space
- Increase shadow distance default to 300 units (slider max 500)
2026-03-07 22:29:06 -08:00
Kelsi
a4966e486f Fix WMO wall collision, normal mapping, POM backfill, and M2/WMO rendering performance
- Fix MOPY flag check (0x08 not 0x01) for proper wall collision detection
- Cap MAX_PUSH to PLAYER_RADIUS to prevent gradual clip-through
- Fix WMO doodad quaternion component ordering (X/Y swap)
- Linear normal map strength blend in shader for smooth slider control
- Enable shadow sampling for interior WMO groups (covered outdoor areas)
- Backfill deferred normal/height maps after streaming with descriptor rebind
- M2: prepareRender only iterates animated instances, bone dirty flag
- M2: remove worker thread VMA allocation, skip unready bone instances
- WMO: persistent visibility vectors, sequential culling
- Add FSR EASU/RCAS shaders
2026-03-07 22:03:28 -08:00
Kelsi
16c6c2b6a0 Raise diagnostic log thresholds to reduce log noise
SLOW update stages: 3ms → 50ms, renderer update: 5ms → 50ms,
loadModel/processAsync/spawnCreature: 3ms → 100ms,
terrain/camera: 3-5ms → 50ms. Remove per-frame spawn breakdown.
2026-03-07 18:43:13 -08:00
Kelsi
7ac990cff4 Background BLP texture pre-decoding + deferred WMO normal maps (12x streaming perf)
Move CPU-heavy BLP texture decoding from main thread to background worker
threads for all hot paths: terrain M2 models, WMO doodad M2s, WMO textures,
creature models, and gameobject WMOs. Each renderer (M2, WMO, Character) now
accepts a pre-decoded BLP cache that loadTexture() checks before falling back
to synchronous decode.

Defer WMO normal/height map generation (3 per-pixel passes: luminance, box
blur, Sobel) during terrain streaming finalization — this was the dominant
remaining bottleneck after BLP pre-decoding.

Terrain streaming stalls: 1576ms → 124ms worst case.
2026-03-07 15:46:56 -08:00
Kelsi
0313bd8692 Performance: ring buffer UBOs, batched load screen uploads, background world preloader
- Replace per-frame VMA alloc/free of material UBOs with a ring buffer in
  CharacterRenderer (~500 allocations/frame eliminated)
- Batch all ready terrain tiles into a single GPU upload during load screen
  (processAllReadyTiles instead of one-at-a-time with individual fence waits)
- Lift per-frame creature/GO spawn budgets during load screen warmup phase
- Add background world preloader: saves last world position to disk, pre-warms
  AssetManager file cache with ADT files starting at app init (login screen)
  so terrain workers get instant cache hits when Enter World is clicked
- Distance-filter expensive collision guard to 8-unit melee range
- Merge 3 CharacterRenderer update loops into single pass
- Time-budget instrumentation for slow update stages (>3ms threshold)
- Count-based async creature model upload budget (max 3/frame in-game)
- 1-per-frame game object spawn + per-doodad time budget for transport loading
- Use deque for creature spawn queue to avoid O(n) front-erase
2026-03-07 13:44:09 -08:00
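The ring-buffer replacement for per-frame UBO allocation described in the commit above can be sketched as an offset allocator over one preallocated buffer. This is a simplified model, not the CharacterRenderer code: `RingAllocator` is a hypothetical name, it only hands out byte offsets, and it assumes the buffer is sized so that wrapping never overwrites data still in use by a frame in flight.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical sketch: sub-allocate aligned regions from one large buffer
// instead of creating and destroying a buffer per material per frame.
class RingAllocator {
public:
    RingAllocator(std::size_t capacity, std::size_t alignment)
        : capacity_(capacity), alignment_(alignment) {}

    // Returns a byte offset into the backing buffer, or nullopt if the
    // request is empty or can never fit.
    std::optional<std::size_t> allocate(std::size_t size) {
        const std::size_t aligned =
            (size + alignment_ - 1) / alignment_ * alignment_;
        if (aligned == 0 || aligned > capacity_) {
            return std::nullopt;
        }
        // Wrap to the start when the tail is too small. Assumption: the
        // caller sized the buffer so the overwritten region is no longer
        // read by the GPU.
        if (head_ + aligned > capacity_) {
            head_ = 0;
        }
        const std::size_t offset = head_;
        head_ += aligned;
        return offset;
    }

private:
    std::size_t capacity_;
    std::size_t alignment_;
    std::size_t head_ = 0;
};
```

In a Vulkan renderer the alignment would typically come from the device's minimum uniform buffer offset alignment, and the returned offset would be used as a dynamic offset or descriptor range start.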
Kelsi
e001aaa2b6 Suppress movement after teleport/portal, add shadow distance slider
- Add movementSuppressTimer to camera controller that forces all movement
  keys to read as false, preventing held W key from carrying through
  loading screens (fixes always-running-forward after instance portals)
- Increase shadow frustum default from 60 to 72 units (+20%)
- Make shadow distance configurable via setShadowDistance() (40-200 range)
- Add shadow distance slider in Video settings tab (persisted to config)
2026-03-06 20:38:58 -08:00
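The movement suppression timer from the commit above can be sketched as a small countdown that masks key state while it is active. The class and method names here (`MovementSuppressor`, `suppressFor`, `filter`) are hypothetical; only the idea of a timer that forces movement keys to read as false comes from the commit message.

```cpp
#include <algorithm>

// Hypothetical sketch: while the timer runs, every movement key reads as
// released, so a key held across a loading screen does not carry over.
class MovementSuppressor {
public:
    // Start (or restart) suppression for the given number of seconds.
    void suppressFor(float seconds) { timer_ = std::max(seconds, 0.0f); }

    // Advance the timer by the frame delta time.
    void update(float deltaSeconds) {
        timer_ = std::max(timer_ - deltaSeconds, 0.0f);
    }

    bool active() const { return timer_ > 0.0f; }

    // Pass the raw key state through this before using it for movement.
    bool filter(bool keyDown) const { return active() ? false : keyDown; }

private:
    float timer_ = 0.0f;
};
```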
Kelsi
ad66ef9ca6 Fix shadow flicker: render every frame, tighten shadow frustum
Remove frame throttling that skipped shadow updates in dense scenes,
causing visible flicker on player and NPCs. Reduce shadow half-extent
from 180 to 60 for 3x higher resolution on nearby shadows.
2026-03-06 20:04:19 -08:00
Kelsi
5a227c0376 Add water refraction toggle with per-frame scene history
Fix VK_ERROR_DEVICE_LOST crash by allocating per-frame scene history
images (color + depth) instead of a single shared image that raced
between frames in flight. Water refraction can now be toggled via
Settings > Video > Water Refraction.

Without refraction: richer blue base colors, animated caustic shimmer,
and normal-based color shifts give the water visible life. With
refraction: clean screen-space refraction with Beer-Lambert absorption.
Disabling clears scene history to black for immediate fallback.
2026-03-06 19:15:34 -08:00
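The per-frame scene history allocation described in the commit above amounts to indexing a small array of image sets by frame slot. The sketch below models that with placeholder handles instead of real Vulkan objects; `SceneHistory`, `SceneHistorySlot`, and `kMaxFramesInFlight` are hypothetical names, and the value 2 mirrors the MAX_FRAMES_IN_FLIGHT=2 figure quoted elsewhere in this log.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kMaxFramesInFlight = 2;  // assumed, see lead-in

// Placeholder handles standing in for the color and depth images.
struct SceneHistorySlot {
    std::uint64_t colorImage = 0;
    std::uint64_t depthImage = 0;
};

// Hypothetical sketch: one history slot per frame in flight, so the capture
// pass of frame N writes a different image than the one frame N-1 still reads.
class SceneHistory {
public:
    SceneHistorySlot& slotFor(std::uint64_t frameNumber) {
        return slots_[frameNumber % kMaxFramesInFlight];
    }

private:
    std::array<SceneHistorySlot, kMaxFramesInFlight> slots_{};
};
```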
Kelsi
3482dacea8 Optimize M2 update loop: skip static doodads, incremental spatial index
- Split M2 instances into fast-path index lists (animated, particle-only,
  particle-all, smoke) to avoid iterating all 46K instances per frame
- Cache model flags (hasAnimation, disableAnimation, isSmoke, etc.) on
  M2Instance struct to eliminate per-frame hash lookups
- Replace full rebuildSpatialIndex on position/transform updates with
  incremental grid cell remove+add, preventing 8.5ms/frame rebuild cost
- Advance animTime for all instances (texture UV animation) but only
  compute bones and particles for the ~3K that need it

M2_UPDATE: 10.7ms → 2.0ms, FPS: 35 → 55-59
2026-03-02 14:45:49 -08:00
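The incremental grid cell remove+add from the commit above can be sketched with a hash map from cell key to instance IDs. This is a 2D illustration under stated assumptions, not the M2 renderer's index: `SpatialGrid`, its methods, and the uniform cell size are all hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: when an instance moves, touch only its old and new
// cells instead of rebuilding the whole index.
class SpatialGrid {
public:
    explicit SpatialGrid(float cellSize) : cellSize_(cellSize) {}

    void insert(std::uint32_t id, float x, float y) {
        cells_[keyFor(x, y)].push_back(id);
    }

    void move(std::uint32_t id, float oldX, float oldY, float newX, float newY) {
        const std::uint64_t from = keyFor(oldX, oldY);
        const std::uint64_t to = keyFor(newX, newY);
        if (from == to) {
            return;  // still in the same cell, nothing to update
        }
        auto it = cells_.find(from);
        if (it != cells_.end()) {
            auto& ids = it->second;
            ids.erase(std::remove(ids.begin(), ids.end(), id), ids.end());
            if (ids.empty()) {
                cells_.erase(it);
            }
        }
        cells_[to].push_back(id);
    }

    std::size_t countInCell(float x, float y) const {
        const auto it = cells_.find(keyFor(x, y));
        return it == cells_.end() ? 0 : it->second.size();
    }

private:
    // Pack the two signed cell coordinates into one 64-bit key.
    std::uint64_t keyFor(float x, float y) const {
        const auto cx = static_cast<std::int32_t>(std::floor(x / cellSize_));
        const auto cy = static_cast<std::int32_t>(std::floor(y / cellSize_));
        return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(cx)) << 32) |
               static_cast<std::uint64_t>(static_cast<std::uint32_t>(cy));
    }

    float cellSize_;
    std::unordered_map<std::uint64_t, std::vector<std::uint32_t>> cells_;
};
```

The cost of a move is proportional to the population of one cell rather than to the total instance count, which is the property the commit relies on to avoid the per-frame rebuild.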
Kelsi
7535084652 Disable captureSceneHistory to fix VK_ERROR_DEVICE_LOST crash
The single sceneColorImage races between frames with MAX_FRAMES_IN_FLIGHT=2:
frame N-1's water shader reads it while frame N's captureSceneHistory writes
it via vkCmdCopyImage. Pipeline barriers only sync within a single command
buffer, not across submissions on the same queue.

This caused VK_ERROR_DEVICE_LOST after ~700 frames on any map with water.
Disable the capture entirely for now — water renders without refraction.

TODO: allocate per-frame scene history images to eliminate the race.
2026-03-02 10:24:02 -08:00
Kelsi
335b1b1c3a Fix terrain loss after map transition and GPU crash on WMO-only maps
Three fixes:
1. Water captureSceneHistory gated on hasSurfaces() — the image layout
   transitions (PRESENT_SRC→TRANSFER_SRC→PRESENT_SRC) were running every
   frame even on WMO-only maps with no water, causing VK_ERROR_DEVICE_LOST.

2. Tile cache invalidation: softReset() now clears tileCache_ since cache
   keys are (x,y) without map name — prevents stale cross-map cache hits.

3. Copy terrain/mesh into TerrainTile instead of std::move — the moved-from
   PendingTile was cached with empty data, so subsequent map loads returned
   tiles with 0 valid chunks from cache.

Also adds diagnostic skip env vars (WOWEE_SKIP_TERRAIN, WOWEE_SKIP_SKY,
WOWEE_SKIP_PREPASSES) and a 0-chunk warning in loadTerrain.
2026-03-02 09:52:09 -08:00
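The stale cross-map cache hit from fix 2 in the commit above can be reproduced with a cache keyed only by tile coordinates. The sketch below is a minimal model: `TileCache`, `CachedTile`, and their members are hypothetical, and only the (x,y) key and the clear-on-soft-reset behavior come from the commit message. Requires C++17 for `insert_or_assign`.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a cached terrain tile.
struct CachedTile {
    std::string sourceMap;
    int validChunks = 0;
};

// Hypothetical sketch: keys are (x, y) only, so two maps that share tile
// coordinates collide unless the cache is cleared between maps.
class TileCache {
public:
    void put(int x, int y, CachedTile tile) {
        tiles_.insert_or_assign(std::make_pair(x, y), std::move(tile));
    }

    // Returns nullptr on a cache miss.
    const CachedTile* find(int x, int y) const {
        const auto it = tiles_.find(std::make_pair(x, y));
        return it == tiles_.end() ? nullptr : &it->second;
    }

    // Called on map change: drop everything so the next map cannot
    // receive tiles that belong to the previous one.
    void softReset() { tiles_.clear(); }

private:
    std::map<std::pair<int, int>, CachedTile> tiles_;
};
```

An alternative design would add the map name to the key so tiles survive a round trip between maps; the commit message describes clearing the cache instead, which is what the sketch follows.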
Kelsi
0c5a915db3 Guard renderWorld/renderHUD against null command buffer after device lost
After VK_ERROR_DEVICE_LOST, beginFrame returns VK_NULL_HANDLE but
renderWorld() and renderHUD() were still called, passing the null
handle to vkCmdBindPipeline which triggered a validation abort.
2026-03-02 08:58:48 -08:00
Kelsi
f1caf8c03e Fix Stockades crash: suppress area triggers on initial login, handle VK_ERROR_DEVICE_LOST
Root cause: LOGIN_VERIFY_WORLD path did not set areaTriggerCheckTimer_ or
areaTriggerSuppressFirst_, so the Stockades exit portal (AT 503) fired
immediately on login, teleporting the player back to Stormwind and crashing
the GPU during the unexpected map transition.

Fixes:
- Set 5s area trigger cooldown + suppress-first in handleLoginVerifyWorld
  (same as SMSG_NEW_WORLD handler already did for teleports)
- Add deviceLost_ flag to VkContext so beginFrame returns immediately once
  VK_ERROR_DEVICE_LOST is detected, preventing infinite retry loops
- Track device lost from both fence wait and queue submit paths
2026-03-02 08:19:14 -08:00
Kelsi
3c55b09a3f Refactor instance loading: extract initializeRenderers, fix deferred state transition
- Extract initializeRenderers() from loadTestTerrain() so WMO-only maps
  (dungeons/raids) initialize renderers directly without a dummy ADT path
- Defer setState(IN_GAME) until after processing any pending deferred world
  entry, preventing brief IN_GAME flicker on the wrong map
- Remove verbose area trigger debug logging (every-second position spam)
2026-03-02 08:11:36 -08:00
Kelsi
48eb0b70a3 Fix GPU resource leaks and re-entrant world loading for instance transitions
Reset descriptor pools in CharacterRenderer/M2Renderer/WMORenderer on map
change to prevent VK_ERROR_DEVICE_LOST from pool exhaustion. Defer re-entrant
SMSG_NEW_WORLD during active world load to avoid recursive cleanup crashes.
Gate swim bubbles on swimming state, skip redundant shadow pipeline re-init,
add WOWEE_SKIP_* env vars for render isolation debugging.
2026-03-02 08:06:35 -08:00
Kelsi
a559d5944b Fix shutdown hangs, bank bag icons/drag-drop, loading screen progress, and login spawn
- Fix shutdown hang: skip vmaDestroyAllocator (walked thousands of allocations),
  replace unsafe pthread_timedjoin_np with plain join + early-exit checks in workers
- Bank window: full icon rendering, click-and-hold pickup (0.10s), drag-drop for
  all bank slots including bank bag equip slots, same-slot drop detection
- Loading screen: process one tile per frame for live progress updates
- Camera reset: trust server position in online mode to avoid spawning under WMOs
- Fix PLAYER_BYTES/PLAYER_BYTES_2 field indices, preserve purchasedBankBagSlots
  across inventory rebuilds, fix bank slot purchase result codes
2026-02-26 13:38:29 -08:00
Kelsi
2219ccde51 Optimize city performance and harden WMO grounding 2026-02-25 10:22:05 -08:00
Kelsi
d65b170774 Make shadows follow player movement continuously
Remove freeze-while-moving and idle smoothing logic from shadow
center computation. Texel snapping already prevents shimmer, so
the shadow projection can track the player directly each frame.
2026-02-23 08:47:38 -08:00