Kelsidavis-WoWee

mirror of https://github.com/Kelsidavis/WoWee.git synced 2026-05-08 10:03:51 +00:00

Author	SHA1	Message	Date
Kelsi	aa43aa6fc8	Bridge FSR3 Vulkan framegen dispatch and route sharpen to interpolated output	2026-03-08 23:20:50 -07:00
Kelsi	538a1db866	Fix FSR3 runtime wrapper for local SDK API and real Vulkan resource dispatch	2026-03-08 23:13:08 -07:00
Kelsi	f1099f5940	Add FSR3 runtime library probing and readiness status	2026-03-08 23:03:45 -07:00
Kelsi	e1c93c47be	Stage FSR3 framegen dispatch hook in frame pipeline	2026-03-08 22:57:35 -07:00
Kelsi	bdfec103ac	Add persisted AMD FSR3 framegen runtime toggle plumbing	2026-03-08 22:53:21 -07:00
Kelsi	a49decd9a6	Add AMD FSR3 framegen interface probe and CI validation	2026-03-08 22:47:46 -07:00
Kelsi	7d89aabae5	Make all CI build jobs AMD-FSR2-only	2026-03-08 21:51:42 -07:00
Kelsi	700e05b142	Bootstrap AMD FSR2 Vulkan permutations cross-platform	2026-03-08 21:45:25 -07:00
Kelsi	09cdcd67b5	Update FSR docs and make AMD CI header check non-fatal	2026-03-08 21:40:26 -07:00
Kelsi	94ad89c764	Set native-focused FSR defaults and reorder quality UI	2026-03-08 21:34:31 -07:00
Kelsi	47287a121a	Fix AMD FSR2 CI by removing Wine permutation generation	2026-03-08 21:22:21 -07:00
Kelsi	e6d373df3e	Remove FSR performance preset and add native quality mode	2026-03-08 21:17:04 -07:00
Kelsi	f6fce0f19a	Fix FSR2 jitter mismatch between projection and dispatch	2026-03-08 21:10:36 -07:00
Kelsi	eccbfb6a5f	Tune FSR2 defaults and simplify jitter controls	2026-03-08 21:08:17 -07:00
Kelsi	2e71c768db	Add live FSR2 motion/jitter tuning controls and HUD readout	2026-03-08 20:56:22 -07:00
Kelsi	38c55e4f37	Add startup FSR2 safety fallback to prevent load hangs	2026-03-08 20:48:46 -07:00
Kelsi	ad2915ce9e	Defer persisted FSR2 activation until world load completes	2026-03-08 20:45:26 -07:00
Kelsi	1c452018e1	Fix AMD CI shader permutations for tcr/autoreactive passes	2026-03-08 20:35:52 -07:00
Kelsi	96f7728227	Add AMD FSR2 CI build job and adjust jitter offset sign	2026-03-08 20:27:39 -07:00
Kelsi	a12126cc7e	Persist upscaling mode and refine FSR2 jitter behavior	2026-03-08 20:22:11 -07:00
Kelsi	6fd1c94d99	Tune AMD FSR2 motion vectors to reduce residual jitter	2026-03-08 20:15:54 -07:00
Kelsi	0a88406a3d	Reduce FSR2 jitter by preserving temporal history	2026-03-08 20:07:01 -07:00
Kelsi	51a8cf565f	Integrate AMD FSR2 backend and document SDK bootstrap	2026-03-08 19:56:52 -07:00
Kelsi	a24ff375fb	Add AMD FSR2 SDK detection and backend integration scaffolding	2026-03-08 19:33:07 -07:00
Kelsi	e2a2316038	Stabilize FSR2 path and refine temporal pipeline groundwork	2026-03-08 18:52:04 -07:00
Kelsi	a8500a80b5	FSR2: selective clamp, tonemapped accumulation, terrain load radius 3 - Selective neighborhood clamp: only modify history when there's actual motion or disocclusion; static pixels pass history through untouched, preventing jitter-chasing from the shifting variance box - Tonemapped accumulation: Reinhard tonemap before blend compresses bright edges so they don't disproportionately cause jitter - Jitter-aware sample weighting: blend 3-20% based on sample proximity - Soft MV dead zone: smoothstep instead of step avoids spatial discontinuity - Aggressive velocity response: 30%/px motion, 50% cap, 80% disocclusion - Terrain loading: radius 3 (49 tiles) to prevent spawn hitches, processOneReadyTile for smooth progress bar updates	2026-03-08 15:15:44 -07:00
Kelsi	2003cc8aaa	FSR2: de-jitter scene sampling, fix loading screen progress FSR2 temporal upscaling: - De-jitter scene color sampling (outUV - jitterUV) for frame-to-frame consistency, eliminating the primary source of temporal jitter - Remove luminance instability dampening (was causing excessive blur) - Simplify to uniform 8% blend (de-jittered values are consistent) - Gamma 2.0 for moderate neighborhood clamping - Motion vector dead zone: zero sub-0.01px motion from float precision noise Loading screen: - Reduce tile load radius from 3 to 2 (25 tiles) for faster loading - Process one tile per iteration for smooth progress bar updates	2026-03-08 14:50:14 -07:00
Kelsi	f74dcc37e0	FSR2: reduce doubling via tighter clamp, MV dead zone, luminance stability - Motion shader: zero out sub-0.01px motion to eliminate float precision noise in reprojection (distant geometry with large world coords) - Accumulate: tighten neighborhood clamp gamma 3.0→1.5 to catch slightly misaligned history causing ghost doubles - Reduce max jitter-aware blend 30%→20% for less visible oscillation - Add luminance instability dampening: reduce blend when current frame disagrees with history to prevent shimmer on small/distant features	2026-03-08 14:34:58 -07:00
Kelsi Rae Davis	e8bbb17196	Merge pull request #12 from rursache/feat/aur-pkgbuild feat: add AUR PKGBUILD for wowee-git	2026-03-08 14:33:30 -07:00
Kelsi Rae Davis	c21da8a1e2	Merge pull request #11 from rursache/fix/arch-vulkan-headers-dep fix(arch): add vulkan-headers to Arch Linux dependency list	2026-03-08 14:32:57 -07:00
Kelsi	c3047c33ba	FSR2: fix motion vector jitter, add bicubic anti-ringing, depth-dilated MVs - Motion shader: unjitter NDC before reprojection (ndc+jitter, not ndc-jitter), compute motion against unjittered UV so static scenes produce zero motion - Pass jitter offset to motion shader (push constant 80→96 bytes) - Accumulate shader: restore Catmull-Rom bicubic with anti-ringing clamp to prevent negative-lobe halos at edges while maintaining sharpness - Add depth-dilated motion vectors (3x3 nearest-to-camera) to prevent background MVs bleeding over foreground edges - Widen neighborhood clamp gamma to 3.0, uniform 5% blend with disocclusion/velocity reactive boosting	2026-03-08 14:18:00 -07:00
Radu Ursache	54ae05d298	feat: add AUR PKGBUILD for wowee-git Adds a wowee-git PKGBUILD suitable for submission to the Arch User Repository. Tracks the main branch HEAD; pkgver is auto-generated from the commit count + short hash so no manual bumping is needed on new releases. Key design decisions: - Real binaries installed to /usr/lib/wowee/ to avoid PATH clutter - /usr/bin/wowee wrapper sets WOW_DATA_PATH to the user's XDG data dir (~/.local/share/wowee/Data) so the asset path works without any user configuration - /usr/bin/wowee-extract-assets helper runs asset_extract pointed at the same XDG data dir; users run this once against their WoW client - Submodules (imgui, vk-bootstrap) fetched from local git mirrors during prepare() as required by AUR source array rules - vulkan-headers listed as makedepend (required by imgui Vulkan backend and vk-bootstrap at compile time; not needed at runtime) Note: stormlib is an AUR dependency (aur/stormlib). Users will need an AUR helper (yay, paru) to install it, or install it manually first.	2026-03-08 12:55:19 +02:00
Radu Ursache	23beae96e2	fix(arch): add vulkan-headers to Arch Linux dependency list vulkan-headers provides <vulkan/vulkan.h> which is required at compile time by imgui (imgui_impl_vulkan.cpp) and vk-bootstrap. On Arch, vulkan-devel is not a package name; the headers must be installed explicitly via vulkan-headers. Also replace vulkan-devel with the correct individual packages: vulkan-headers (build-time headers) and vulkan-icd-loader / vulkan-tools (runtime + utilities). Fixes build failure: fatal error: vulkan/vulkan.h: No such file or directory	2026-03-08 12:51:33 +02:00
Kelsi	e94eb7f2d1	FSR2 temporal upscaling fixes: unjittered reprojection, sharpen Y-flip, MSAA guard, descriptor double-buffering - Motion vectors: single unjittered reprojection matrix (80 bytes) instead of two jittered matrices (160 bytes), eliminating numerical instability from jitter amplification through large world coordinates - Sharpen pass: fix Y-flip for correct UV sampling, double-buffer descriptor sets to avoid race with in-flight command buffers - MSAA: auto-disable when FSR2 enabled, grey out AA setting in UI - Accumulation: variance-based neighborhood clamping in YCoCg space, correct history layout transitions - Frame index: wrap at 256 for stable Halton sequence	2026-03-08 01:22:15 -08:00
Kelsi	52317d1edd	Implement FSR 2.2 temporal upscaling Full FSR 2.2 pipeline with depth-based motion vector reprojection, temporal accumulation with YCoCg neighborhood clamping, and RCAS contrast-adaptive sharpening. Architecture (designed for FSR 3.x frame generation readiness): - Camera: Halton(2,3) sub-pixel jitter with unjittered projection stored separately for motion vector computation - Motion vectors: compute shader reconstructs world position from depth + inverse VP, reprojects with previous frame's VP - Temporal accumulation: compute shader blends 5-10% current frame with 90-95% clamped history, adaptive blend for disocclusion - History: ping-pong R16G16B16A16 buffers at display resolution - Sharpening: RCAS fragment pass with contrast-adaptive weights Integration: - FSR2 replaces both FSR1 and MSAA when enabled - Scene renders to internal resolution framebuffer (no MSAA) - Compute passes run between scene and swapchain render passes - Camera cut detection resets history on teleport - Quality presets shared with FSR1 (0.50-0.77 scale factors) - UI: "Upscaling" combo with Off/FSR 1.0/FSR 2.2 options	2026-03-07 23:13:01 -08:00
Kelsi	0ffeabd4ed	Revert "Further reduce tile streaming aggressiveness" This reverts commit `f681a8b361`.	2026-03-07 23:02:25 -08:00
Kelsi	f681a8b361	Further reduce tile streaming aggressiveness - Load radius: 4→3 (normal), 6→5 (taxi) - Terrain chunks per step: 16→8 - M2 models per step: 6→2 (removed idle boost) - WMO models per step: 2→1 (removed idle boost) - WMO doodads per step: 4→2 - All budgets now constant (no idle-vs-busy branching)	2026-03-07 22:55:02 -08:00
Kelsi	7f573fc06b	Reduce tile finalization aggressiveness to prevent spawn hitching - Reduce max finalization steps per frame: 2→1 (normal), 8→4 (taxi) - Reduce terrain chunk upload batch: 32→16 chunks per step - Reduce idle M2 model upload budget: 16→6 per step - Reduce idle WMO model upload budget: 4→2 per step Tiles still stream in quickly but spread GPU upload work across more frames, eliminating the frame spikes right after spawning.	2026-03-07 22:51:59 -08:00
Kelsi	ac3c90dd75	Fix M2 animated instance flashing (deer/bird/critter pop-in) Root cause: bonesDirty was a single bool shared across both double-buffered frame indices. When bones were copied to frame 0's SSBO and bonesDirty cleared, frame 1's newly-allocated SSBO would contain garbage/zeros and never get populated — causing animated M2 instances to flash invisible on alternating frames. Fix: Make bonesDirty per-frame-index (bool[2]) so each buffer independently tracks whether it needs bone data uploaded. When bones are recomputed, both indices are marked dirty. When uploaded during render, only the current frame index is cleared. New buffer allocations in prepareRender force their frame index dirty.	2026-03-07 22:47:07 -08:00
Kelsi	6cf08fbaa6	Throttle proactive tile streaming to reduce post-load hitching Add 2-second cooldown timer before re-checking for unloaded tiles when workers are idle, preventing excessive streamTiles() calls that caused frame hitches right after world load.	2026-03-07 22:40:07 -08:00
Kelsi	c13dbf2198	Proactive tile streaming, faster finalization, tree trunk collision - Re-check for unloaded tiles when workers are idle (no tile boundary needed) - Increase M2 upload budget 4→16 and WMO 1→4 per frame when not under pressure - Lower tree collision threshold from 40 to 6 units so large trees block movement	2026-03-07 22:35:18 -08:00
Kelsi	4cb03c38fe	Parallel animation updates, thread-safe collision, M2 pop-in fix, shadow stabilization - Overlap M2 and character animation updates via std::async (~2-5ms saved) - Thread-local collision scratch buffers for concurrent floor queries - Parallel terrain/WMO/M2 floor queries in camera controller - Seed new M2 instance bones from existing siblings to eliminate pop-in flash - Fix shadow flicker: snap center along stable light axes instead of in view space - Increase shadow distance default to 300 units (slider max 500)	2026-03-07 22:29:06 -08:00
Kelsi	a4966e486f	Fix WMO wall collision, normal mapping, POM backfill, and M2/WMO rendering performance - Fix MOPY flag check (0x08 not 0x01) for proper wall collision detection - Cap MAX_PUSH to PLAYER_RADIUS to prevent gradual clip-through - Fix WMO doodad quaternion component ordering (X/Y swap) - Linear normal map strength blend in shader for smooth slider control - Enable shadow sampling for interior WMO groups (covered outdoor areas) - Backfill deferred normal/height maps after streaming with descriptor rebind - M2: prepareRender only iterates animated instances, bone dirty flag - M2: remove worker thread VMA allocation, skip unready bone instances - WMO: persistent visibility vectors, sequential culling - Add FSR EASU/RCAS shaders	2026-03-07 22:03:28 -08:00
Kelsi	16c6c2b6a0	Raise diagnostic log thresholds to reduce log noise SLOW update stages: 3ms → 50ms, renderer update: 5ms → 50ms, loadModel/processAsync/spawnCreature: 3ms → 100ms, terrain/camera: 3-5ms → 50ms. Remove per-frame spawn breakdown.	2026-03-07 18:43:13 -08:00
Kelsi	02cf0e4df3	Background normal map generation, queue-draining load screen warmup - Normal map CPU work (luminance→blur→Sobel) moved to background threads, main thread only does GPU upload (~1-2ms vs 15-22ms per texture) - Load screen warmup now waits until ALL spawn/equipment/gameobject queues are drained before transitioning (prevents naked character, NPC pop-in) - Exit condition: min 2s + 5 consecutive empty iterations, hard cap 15s - Equipment queue processes 8 items per warmup iteration instead of 1 - Added LoadingScreen::renderOverlay() for future world-behind-loading use	2026-03-07 18:40:24 -08:00
Kelsi	63efac9fa6	Unlimited creature model uploads during load screen, remove duplicate code Loading screen now calls processCreatureSpawnQueue(unlimited=true) which removes the 1-upload-per-frame cap and 2ms time budget, allowing all pending creature models to upload to GPU in bulk. Also increases concurrent async background loads from 4 to 16 during load screen. Replaces 40-line inline duplicate of processAsyncCreatureResults with the shared function.	2026-03-07 17:31:47 -08:00
Kelsi	24f2ec75ec	Defer normal map generation to reduce GPU model upload stalls by ~50% Each loadTexture call was generating a normal/height map inline (3 full-image passes: luminance + blur + Sobel). For models with 15-20 textures this added 30-40ms to the 70ms model upload. Now deferred to a per-frame budget (2/frame in-game, 10/frame during load screen). Models render without POM until their normal maps are ready.	2026-03-07 17:16:38 -08:00
Kelsi	faca22ac5f	Async humanoid NPC texture pipeline to eliminate 30-150ms main-thread stalls Move all DBC lookups (CharSections, ItemDisplayInfo), texture path resolution, and BLP decoding for humanoid NPCs to background threads. Only GPU texture uploads remain on the main thread via pre-decoded BLP cache.	2026-03-07 16:54:58 -08:00
Kelsi	7ac990cff4	Background BLP texture pre-decoding + deferred WMO normal maps (12x streaming perf) Move CPU-heavy BLP texture decoding from main thread to background worker threads for all hot paths: terrain M2 models, WMO doodad M2s, WMO textures, creature models, and gameobject WMOs. Each renderer (M2, WMO, Character) now accepts a pre-decoded BLP cache that loadTexture() checks before falling back to synchronous decode. Defer WMO normal/height map generation (3 per-pixel passes: luminance, box blur, Sobel) during terrain streaming finalization — this was the dominant remaining bottleneck after BLP pre-decoding. Terrain streaming stalls: 1576ms → 124ms worst case.	2026-03-07 15:46:56 -08:00
Kelsi	0313bd8692	Performance: ring buffer UBOs, batched load screen uploads, background world preloader - Replace per-frame VMA alloc/free of material UBOs with a ring buffer in CharacterRenderer (~500 allocations/frame eliminated) - Batch all ready terrain tiles into a single GPU upload during load screen (processAllReadyTiles instead of one-at-a-time with individual fence waits) - Lift per-frame creature/GO spawn budgets during load screen warmup phase - Add background world preloader: saves last world position to disk, pre-warms AssetManager file cache with ADT files starting at app init (login screen) so terrain workers get instant cache hits when Enter World is clicked - Distance-filter expensive collision guard to 8-unit melee range - Merge 3 CharacterRenderer update loops into single pass - Time-budget instrumentation for slow update stages (>3ms threshold) - Count-based async creature model upload budget (max 3/frame in-game) - 1-per-frame game object spawn + per-doodad time budget for transport loading - Use deque for creature spawn queue to avoid O(n) front-erase	2026-03-07 13:44:09 -08:00
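
Commits 52317d1edd and e94eb7f2d1 above mention Halton(2,3) sub-pixel camera jitter and a frame index that wraps at 256. The following is a minimal sketch of that idea, not code taken from the repository (the project itself is C++/Vulkan; Python is used here only for brevity). The names `halton` and `jitter_offset`, the one-based sequence index, and the centering of offsets on [-0.5, 0.5) pixels are illustrative assumptions.

```python
def halton(index: int, base: int) -> float:
    """Radical inverse of a positive `index` in `base`; the result lies in [0, 1)."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        # Mirror each digit of `index` (in `base`) around the radix point.
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def jitter_offset(frame_index: int, wrap: int = 256) -> tuple[float, float]:
    """Sub-pixel jitter for a frame, in pixels, centered on zero.

    Assumption: the frame index wraps at `wrap` so the sequence repeats
    exactly, and the sequence starts at 1 because halton(0, b) is always 0.
    """
    i = (frame_index % wrap) + 1
    return halton(i, 2) - 0.5, halton(i, 3) - 0.5


if __name__ == "__main__":
    for frame in range(4):
        print(frame, jitter_offset(frame))
```

Wrapping the index keeps the offsets periodic, so frame 256 reuses the offset of frame 0 rather than extending the sequence indefinitely.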


1075 commits