Commit graph

1066 commits

Author SHA1 Message Date
Kelsi
94ad89c764 Set native-focused FSR defaults and reorder quality UI 2026-03-08 21:34:31 -07:00
Kelsi
47287a121a Fix AMD FSR2 CI by removing Wine permutation generation 2026-03-08 21:22:21 -07:00
Kelsi
e6d373df3e Remove FSR performance preset and add native quality mode 2026-03-08 21:17:04 -07:00
Kelsi
f6fce0f19a Fix FSR2 jitter mismatch between projection and dispatch 2026-03-08 21:10:36 -07:00
Kelsi
eccbfb6a5f Tune FSR2 defaults and simplify jitter controls 2026-03-08 21:08:17 -07:00
Kelsi
2e71c768db Add live FSR2 motion/jitter tuning controls and HUD readout 2026-03-08 20:56:22 -07:00
Kelsi
38c55e4f37 Add startup FSR2 safety fallback to prevent load hangs 2026-03-08 20:48:46 -07:00
Kelsi
ad2915ce9e Defer persisted FSR2 activation until world load completes 2026-03-08 20:45:26 -07:00
Kelsi
1c452018e1 Fix AMD CI shader permutations for tcr/autoreactive passes 2026-03-08 20:35:52 -07:00
Kelsi
96f7728227 Add AMD FSR2 CI build job and adjust jitter offset sign 2026-03-08 20:27:39 -07:00
Kelsi
a12126cc7e Persist upscaling mode and refine FSR2 jitter behavior 2026-03-08 20:22:11 -07:00
Kelsi
6fd1c94d99 Tune AMD FSR2 motion vectors to reduce residual jitter 2026-03-08 20:15:54 -07:00
Kelsi
0a88406a3d Reduce FSR2 jitter by preserving temporal history 2026-03-08 20:07:01 -07:00
Kelsi
51a8cf565f Integrate AMD FSR2 backend and document SDK bootstrap 2026-03-08 19:56:52 -07:00
Kelsi
a24ff375fb Add AMD FSR2 SDK detection and backend integration scaffolding 2026-03-08 19:33:07 -07:00
Kelsi
e2a2316038 Stabilize FSR2 path and refine temporal pipeline groundwork 2026-03-08 18:52:04 -07:00
Kelsi
a8500a80b5 FSR2: selective clamp, tonemapped accumulation, terrain load radius 3
Some checks are pending
Build / Build (arm64) (push) Waiting to run
Build / Build (x86-64) (push) Waiting to run
Build / Build (macOS arm64) (push) Waiting to run
Build / Build (windows-arm64) (push) Waiting to run
Build / Build (windows-x86-64) (push) Waiting to run
Security / CodeQL (C/C++) (push) Waiting to run
Security / Semgrep (push) Waiting to run
Security / Sanitizer Build (ASan/UBSan) (push) Waiting to run
- Selective neighborhood clamp: only modify history when there's actual
  motion or disocclusion — static pixels pass history through untouched,
  preventing jitter-chasing from the shifting variance box
- Tonemapped accumulation: Reinhard tonemap before blend compresses bright
  edges so they don't disproportionately cause jitter
- Jitter-aware sample weighting: blend 3-20% based on sample proximity
- Soft MV dead zone: smoothstep instead of step avoids spatial discontinuity
- Aggressive velocity response: 30%/px motion, 50% cap, 80% disocclusion
- Terrain loading: radius 3 (49 tiles) to prevent spawn hitches,
  processOneReadyTile for smooth progress bar updates
2026-03-08 15:15:44 -07:00
Kelsi
2003cc8aaa FSR2: de-jitter scene sampling, fix loading screen progress
FSR2 temporal upscaling:
- De-jitter scene color sampling (outUV - jitterUV) for frame-to-frame
  consistency, eliminating the primary source of temporal jitter
- Remove luminance instability dampening (was causing excessive blur)
- Simplify to uniform 8% blend (de-jittered values are consistent)
- Gamma 2.0 for moderate neighborhood clamping
- Motion vector dead zone: zero sub-0.01px motion from float precision noise

Loading screen:
- Reduce tile load radius from 3 to 2 (25 tiles) for faster loading
- Process one tile per iteration for smooth progress bar updates
2026-03-08 14:50:14 -07:00
Kelsi
f74dcc37e0 FSR2: reduce doubling via tighter clamp, MV dead zone, luminance stability
- Motion shader: zero out sub-0.01px motion to eliminate float precision
  noise in reprojection (distant geometry with large world coords)
- Accumulate: tighten neighborhood clamp gamma 3.0→1.5 to catch slightly
  misaligned history causing ghost doubles
- Reduce max jitter-aware blend 30%→20% for less visible oscillation
- Add luminance instability dampening: reduce blend when current frame
  disagrees with history to prevent shimmer on small/distant features
2026-03-08 14:34:58 -07:00
Kelsi Rae Davis
e8bbb17196
Merge pull request #12 from rursache/feat/aur-pkgbuild
feat: add AUR PKGBUILD for wowee-git
2026-03-08 14:33:30 -07:00
Kelsi Rae Davis
c21da8a1e2
Merge pull request #11 from rursache/fix/arch-vulkan-headers-dep
fix(arch): add vulkan-headers to Arch Linux dependency list
2026-03-08 14:32:57 -07:00
Kelsi
c3047c33ba FSR2: fix motion vector jitter, add bicubic anti-ringing, depth-dilated MVs
- Motion shader: unjitter NDC before reprojection (ndc+jitter, not ndc-jitter),
  compute motion against unjittered UV so static scenes produce zero motion
- Pass jitter offset to motion shader (push constant 80→96 bytes)
- Accumulate shader: restore Catmull-Rom bicubic with anti-ringing clamp to
  prevent negative-lobe halos at edges while maintaining sharpness
- Add depth-dilated motion vectors (3x3 nearest-to-camera) to prevent
  background MVs bleeding over foreground edges
- Widen neighborhood clamp gamma to 3.0, uniform 5% blend with
  disocclusion/velocity reactive boosting
2026-03-08 14:18:00 -07:00
Radu Ursache
54ae05d298 feat: add AUR PKGBUILD for wowee-git
Adds a wowee-git PKGBUILD suitable for submission to the Arch User
Repository. Tracks the main branch HEAD; pkgver is auto-generated from
the commit count + short hash so no manual bumping is needed on new
releases.

Key design decisions:
- Real binaries installed to /usr/lib/wowee/ to avoid PATH clutter
- /usr/bin/wowee wrapper sets WOW_DATA_PATH to the user's XDG data dir
  (~/.local/share/wowee/Data) so the asset path works without any
  user configuration
- /usr/bin/wowee-extract-assets helper runs asset_extract pointed at
  the same XDG data dir; users run this once against their WoW client
- Submodules (imgui, vk-bootstrap) fetched from local git mirrors
  during prepare() as required by AUR source array rules
- vulkan-headers listed as makedepend (required by imgui Vulkan backend
  and vk-bootstrap at compile time; not needed at runtime)

Note: stormlib is an AUR dependency (aur/stormlib). Users will need
an AUR helper (yay, paru) to install it, or install it manually first.
2026-03-08 12:55:19 +02:00
Radu Ursache
23beae96e2 fix(arch): add vulkan-headers to Arch Linux dependency list
vulkan-headers provides <vulkan/vulkan.h> which is required at compile
time by imgui (imgui_impl_vulkan.cpp) and vk-bootstrap. On Arch,
vulkan-devel is not a package name — the headers must be installed
explicitly via vulkan-headers.

Also replace vulkan-devel with the correct individual packages:
  vulkan-headers  (build-time headers)
  vulkan-icd-loader / vulkan-tools (runtime + utilities)

Fixes build failure: fatal error: vulkan/vulkan.h: No such file or directory
2026-03-08 12:51:33 +02:00
Kelsi
e94eb7f2d1 FSR2 temporal upscaling fixes: unjittered reprojection, sharpen Y-flip, MSAA guard, descriptor double-buffering
Some checks are pending
Build / Build (arm64) (push) Waiting to run
Build / Build (x86-64) (push) Waiting to run
Build / Build (macOS arm64) (push) Waiting to run
Build / Build (windows-arm64) (push) Waiting to run
Build / Build (windows-x86-64) (push) Waiting to run
Security / CodeQL (C/C++) (push) Waiting to run
Security / Semgrep (push) Waiting to run
Security / Sanitizer Build (ASan/UBSan) (push) Waiting to run
- Motion vectors: single unjittered reprojection matrix (80 bytes) instead of
  two jittered matrices (160 bytes), eliminating numerical instability from
  jitter amplification through large world coordinates
- Sharpen pass: fix Y-flip for correct UV sampling, double-buffer descriptor
  sets to avoid race with in-flight command buffers
- MSAA: auto-disable when FSR2 enabled, grey out AA setting in UI
- Accumulation: variance-based neighborhood clamping in YCoCg space,
  correct history layout transitions
- Frame index: wrap at 256 for stable Halton sequence
2026-03-08 01:22:15 -08:00
Kelsi
52317d1edd Implement FSR 2.2 temporal upscaling
Full FSR 2.2 pipeline with depth-based motion vector reprojection,
temporal accumulation with YCoCg neighborhood clamping, and RCAS
contrast-adaptive sharpening.

Architecture (designed for FSR 3.x frame generation readiness):
- Camera: Halton(2,3) sub-pixel jitter with unjittered projection
  stored separately for motion vector computation
- Motion vectors: compute shader reconstructs world position from
  depth + inverse VP, reprojects with previous frame's VP
- Temporal accumulation: compute shader blends 5-10% current frame
  with 90-95% clamped history, adaptive blend for disocclusion
- History: ping-pong R16G16B16A16 buffers at display resolution
- Sharpening: RCAS fragment pass with contrast-adaptive weights

Integration:
- FSR2 replaces both FSR1 and MSAA when enabled
- Scene renders to internal resolution framebuffer (no MSAA)
- Compute passes run between scene and swapchain render passes
- Camera cut detection resets history on teleport
- Quality presets shared with FSR1 (0.50-0.77 scale factors)
- UI: "Upscaling" combo with Off/FSR 1.0/FSR 2.2 options
2026-03-07 23:13:01 -08:00
Kelsi
0ffeabd4ed Revert "Further reduce tile streaming aggressiveness"
This reverts commit f681a8b361.
2026-03-07 23:02:25 -08:00
Kelsi
f681a8b361 Further reduce tile streaming aggressiveness
- Load radius: 4→3 (normal), 6→5 (taxi)
- Terrain chunks per step: 16→8
- M2 models per step: 6→2 (removed idle boost)
- WMO models per step: 2→1 (removed idle boost)
- WMO doodads per step: 4→2
- All budgets now constant (no idle-vs-busy branching)
2026-03-07 22:55:02 -08:00
Kelsi
7f573fc06b Reduce tile finalization aggressiveness to prevent spawn hitching
- Reduce max finalization steps per frame: 2→1 (normal), 8→4 (taxi)
- Reduce terrain chunk upload batch: 32→16 chunks per step
- Reduce idle M2 model upload budget: 16→6 per step
- Reduce idle WMO model upload budget: 4→2 per step

Tiles still stream in quickly but spread GPU upload work across
more frames, eliminating the frame spikes right after spawning.
2026-03-07 22:51:59 -08:00
Kelsi
ac3c90dd75 Fix M2 animated instance flashing (deer/bird/critter pop-in)
Root cause: bonesDirty was a single bool shared across both
double-buffered frame indices. When bones were copied to frame 0's
SSBO and bonesDirty cleared, frame 1's newly-allocated SSBO would
contain garbage/zeros and never get populated — causing animated
M2 instances to flash invisible on alternating frames.

Fix: Make bonesDirty per-frame-index (bool[2]) so each buffer
independently tracks whether it needs bone data uploaded. When
bones are recomputed, both indices are marked dirty. When uploaded
during render, only the current frame index is cleared. New buffer
allocations in prepareRender force their frame index dirty.
2026-03-07 22:47:07 -08:00
Kelsi
6cf08fbaa6 Throttle proactive tile streaming to reduce post-load hitching
Add 2-second cooldown timer before re-checking for unloaded tiles
when workers are idle, preventing excessive streamTiles() calls
that caused frame hitches right after world load.
2026-03-07 22:40:07 -08:00
Kelsi
c13dbf2198 Proactive tile streaming, faster finalization, tree trunk collision
- Re-check for unloaded tiles when workers are idle (no tile boundary needed)
- Increase M2 upload budget 4→16 and WMO 1→4 per frame when not under pressure
- Lower tree collision threshold from 40 to 6 units so large trees block movement
2026-03-07 22:35:18 -08:00
Kelsi
4cb03c38fe Parallel animation updates, thread-safe collision, M2 pop-in fix, shadow stabilization
- Overlap M2 and character animation updates via std::async (~2-5ms saved)
- Thread-local collision scratch buffers for concurrent floor queries
- Parallel terrain/WMO/M2 floor queries in camera controller
- Seed new M2 instance bones from existing siblings to eliminate pop-in flash
- Fix shadow flicker: snap center along stable light axes instead of in view space
- Increase shadow distance default to 300 units (slider max 500)
2026-03-07 22:29:06 -08:00
Kelsi
a4966e486f Fix WMO wall collision, normal mapping, POM backfill, and M2/WMO rendering performance
- Fix MOPY flag check (0x08 not 0x01) for proper wall collision detection
- Cap MAX_PUSH to PLAYER_RADIUS to prevent gradual clip-through
- Fix WMO doodad quaternion component ordering (X/Y swap)
- Linear normal map strength blend in shader for smooth slider control
- Enable shadow sampling for interior WMO groups (covered outdoor areas)
- Backfill deferred normal/height maps after streaming with descriptor rebind
- M2: prepareRender only iterates animated instances, bone dirty flag
- M2: remove worker thread VMA allocation, skip unready bone instances
- WMO: persistent visibility vectors, sequential culling
- Add FSR EASU/RCAS shaders
2026-03-07 22:03:28 -08:00
Kelsi
16c6c2b6a0 Raise diagnostic log thresholds to reduce log noise
SLOW update stages: 3ms → 50ms, renderer update: 5ms → 50ms,
loadModel/processAsync/spawnCreature: 3ms → 100ms,
terrain/camera: 3-5ms → 50ms. Remove per-frame spawn breakdown.
2026-03-07 18:43:13 -08:00
Kelsi
02cf0e4df3 Background normal map generation, queue-draining load screen warmup
- Normal map CPU work (luminance→blur→Sobel) moved to background threads,
  main thread only does GPU upload (~1-2ms vs 15-22ms per texture)
- Load screen warmup now waits until ALL spawn/equipment/gameobject queues
  are drained before transitioning (prevents naked character, NPC pop-in)
- Exit condition: min 2s + 5 consecutive empty iterations, hard cap 15s
- Equipment queue processes 8 items per warmup iteration instead of 1
- Added LoadingScreen::renderOverlay() for future world-behind-loading use
2026-03-07 18:40:24 -08:00
Kelsi
63efac9fa6 Unlimited creature model uploads during load screen, remove duplicate code
Loading screen now calls processCreatureSpawnQueue(unlimited=true) which
removes the 1-upload-per-frame cap and 2ms time budget, allowing all pending
creature models to upload to GPU in bulk. Also increases concurrent async
background loads from 4 to 16 during load screen. Replaces 40-line inline
duplicate of processAsyncCreatureResults with the shared function.
2026-03-07 17:31:47 -08:00
Kelsi
24f2ec75ec Defer normal map generation to reduce GPU model upload stalls by ~50%
Some checks are pending
Build / Build (arm64) (push) Waiting to run
Build / Build (x86-64) (push) Waiting to run
Build / Build (macOS arm64) (push) Waiting to run
Build / Build (windows-arm64) (push) Waiting to run
Build / Build (windows-x86-64) (push) Waiting to run
Security / CodeQL (C/C++) (push) Waiting to run
Security / Semgrep (push) Waiting to run
Security / Sanitizer Build (ASan/UBSan) (push) Waiting to run
Each loadTexture call was generating a normal/height map inline (3 full-image
passes: luminance + blur + Sobel). For models with 15-20 textures this added
30-40ms to the 70ms model upload. Now deferred to a per-frame budget (2/frame
in-game, 10/frame during load screen). Models render without POM until their
normal maps are ready.
2026-03-07 17:16:38 -08:00
Kelsi
faca22ac5f Async humanoid NPC texture pipeline to eliminate 30-150ms main-thread stalls
Move all DBC lookups (CharSections, ItemDisplayInfo), texture path resolution,
and BLP decoding for humanoid NPCs to background threads. Only GPU texture
uploads remain on the main thread via pre-decoded BLP cache.
2026-03-07 16:54:58 -08:00
Kelsi
7ac990cff4 Background BLP texture pre-decoding + deferred WMO normal maps (12x streaming perf)
Move CPU-heavy BLP texture decoding from main thread to background worker
threads for all hot paths: terrain M2 models, WMO doodad M2s, WMO textures,
creature models, and gameobject WMOs. Each renderer (M2, WMO, Character) now
accepts a pre-decoded BLP cache that loadTexture() checks before falling back
to synchronous decode.

Defer WMO normal/height map generation (3 per-pixel passes: luminance, box
blur, Sobel) during terrain streaming finalization — this was the dominant
remaining bottleneck after BLP pre-decoding.

Terrain streaming stalls: 1576ms → 124ms worst case.
2026-03-07 15:46:56 -08:00
Kelsi
0313bd8692 Performance: ring buffer UBOs, batched load screen uploads, background world preloader
- Replace per-frame VMA alloc/free of material UBOs with a ring buffer in
  CharacterRenderer (~500 allocations/frame eliminated)
- Batch all ready terrain tiles into a single GPU upload during load screen
  (processAllReadyTiles instead of one-at-a-time with individual fence waits)
- Lift per-frame creature/GO spawn budgets during load screen warmup phase
- Add background world preloader: saves last world position to disk, pre-warms
  AssetManager file cache with ADT files starting at app init (login screen)
  so terrain workers get instant cache hits when Enter World is clicked
- Distance-filter expensive collision guard to 8-unit melee range
- Merge 3 CharacterRenderer update loops into single pass
- Time-budget instrumentation for slow update stages (>3ms threshold)
- Count-based async creature model upload budget (max 3/frame in-game)
- 1-per-frame game object spawn + per-doodad time budget for transport loading
- Use deque for creature spawn queue to avoid O(n) front-erase
2026-03-07 13:44:09 -08:00
Kelsi
71e8ed5b7d Reduce initial load to radius 1 (~5 tiles) for fast game entry
Was waiting for all ~50 tiles (radius 4) to fully prepare + finalize
before entering the game. Now loads only the immediate surrounding tiles
during the loading screen, then restores the full radius for in-game
streaming. setLoadRadius just sets an int — actual loading happens lazily
via background workers during the game loop.
2026-03-07 12:39:38 -08:00
Kelsi
25bb63c50a Faster terrain/model loading: more workers, batched finalization, skip redundant I/O
- Worker threads: use (cores - 1-2) instead of cores/2, minimum 4
- Outer upload batch in processReadyTiles: ALL model/texture uploads per
  frame share a single command buffer submission + fence wait
- Upload multiple models per finalization step: 8 M2s, 4 WMOs, 16 doodads
  per call instead of 1 each (all within same GPU batch)
- Terrain chunks: 64 per step instead of 16
- Skip redundant M2 file I/O: thread-safe uploadedM2Ids_ set lets
  background workers skip re-reading+parsing models already on GPU
- processAllReadyTiles (loading screen) and processOneReadyTile also
  wrapped in outer upload batches
2026-03-07 12:32:39 -08:00
Kelsi
16b4336700 Batch GPU uploads to eliminate per-upload fence waits (stutter fix)
Every uploadBuffer/VkTexture::upload called immediateSubmit which did a
separate vkQueueSubmit + vkWaitForFences. Loading a single creature model
with textures caused 4-8+ fence waits; terrain chunks caused 80+ per batch.

Added beginUploadBatch/endUploadBatch to VkContext: records all upload
commands into a single command buffer, submits once with one fence wait.
Staging buffers are deferred for cleanup after the batch completes.

Wrapped in batch mode:
- CharacterRenderer::loadModel (creature VB/IB + textures)
- M2Renderer::loadModel (doodad VB/IB + textures)
- TerrainRenderer::loadTerrain/loadTerrainIncremental (chunk geometry + textures)
- TerrainRenderer::uploadPreloadedTextures
- WMORenderer::loadModel (group geometry + textures)
2026-03-07 12:19:59 -08:00
Kelsi
884b72bc1c Incremental terrain upload + M2 instance dedup hash for city stutter
Terrain finalization was uploading all 256 chunks (GPU fence waits) in one
atomic advanceFinalization call that couldn't be interrupted by the 5ms time
budget. Now split into incremental batches of 16 chunks per call, allowing
the time budget to yield between batches.

M2 instance creation had O(N) dedup scans iterating ALL instances to check
for duplicates. In cities with 5000+ doodads, this caused O(N²) total work
during tile loading. Replaced with hash-based DedupKey map for O(1) lookups.

Changes:
- TerrainRenderer::loadTerrainIncremental: uploads N chunks per call
- FinalizingTile tracks terrainChunkNext for cross-frame progress
- TERRAIN phase yields after preload and after each chunk batch
- M2Renderer::DedupKey hash map replaces linear scan in createInstance
  and createInstanceWithMatrix
- Dedup map maintained through rebuildSpatialIndex and clear paths
2026-03-07 11:59:19 -08:00
Kelsi
f9410cc4bd Fix city NPC stuttering: async model loading, CharSections cache, frame budgets
- Async creature model loading: M2 file I/O and parsing on background threads
  via std::async, GPU upload on main thread when ready (MAX_ASYNC_CREATURE_LOADS=4)
- CharSections.dbc lookup cache: O(1) hash lookup instead of O(N) full DBC scan
  per humanoid NPC spawn (was scanning thousands of records twice per spawn)
- Frame time budget: 4ms cap on creature spawn processing per frame
- Wolf/worg model name check cached per modelId (was doing tolower+find per
  hostile creature per frame)
- Weapon attach throttle: max 2 per 1s tick (was attempting all unweaponized NPCs)
- Separate texture application tracking (displayIdTexturesApplied_) so async-loaded
  models still get skin/equipment textures applied correctly
2026-03-07 11:44:14 -08:00
Kelsi
f374e19239 Comprehensive README/status update covering 60+ commits since Feb 2026
Some checks are pending
Build / Build (arm64) (push) Waiting to run
Build / Build (x86-64) (push) Waiting to run
Build / Build (macOS arm64) (push) Waiting to run
Build / Build (windows-arm64) (push) Waiting to run
Build / Build (windows-x86-64) (push) Waiting to run
Security / CodeQL (C/C++) (push) Waiting to run
Security / Semgrep (push) Waiting to run
Security / Sanitizer Build (ASan/UBSan) (push) Waiting to run
Add instances, bank, auction house, mail, pets, party stats, map
exploration, talent/spellbook revamp, chest looting, spirit healer,
CI builds, performance optimizations, shadow/collision/lava fixes.
2026-03-07 00:55:34 -08:00
Kelsi
88dc1f94a6 Update README and status docs for v0.3.6-preview
- Add water/lava/lighting rendering features to README
- Add transport riding to movement features
- Update status date and rendering capabilities
- Note interior shadow and lava steam known gaps
2026-03-07 00:50:45 -08:00
Kelsi
41218a3b04 Remove diagnostic logging for lava/steam/MLIQ 2026-03-07 00:49:11 -08:00
Kelsi
a24fe4cc45 Ironforge Great Forge lava, magma water rendering, LavaSteam particle effects
- Add magma/slime rendering path to water shader (fbm noise, crust/molten/core coloring)
- Fix WMO liquid height filter rejecting high-altitude zones like Ironforge (Z>300)
- Allow interior WMO magma/slime MLIQ groups to load (skip only water/ocean)
- Mark LAVASTEAM.m2 as spell effect for proper additive blend, hide emission mesh
- Add isLavaModel flag for M2 ForgeLava/LavaPots UV scroll fallback
- Add isLava material detection in WMO renderer for lava texture UV animation
- Fix WMO material UBO colors for magma (was blue, now orange-red)
2026-03-07 00:48:04 -08:00