mirror of
https://github.com/Kelsidavis/WoWee.git
synced 2026-04-16 01:03:51 +00:00
Implement GPU-driven Hierarchical-Z occlusion culling for M2 doodads using a depth pyramid built from the previous frame's depth buffer. The cull shader projects bounding spheres via prevViewProj (temporal reprojection) and samples the HiZ pyramid to reject hidden objects before the main render pass.

Key implementation details:
- Separate early compute submission (beginSingleTimeCommands + fence wait) eliminates 2-frame visibility staleness
- Conservative safeguards prevent false culls: screen-edge guard, full VP row-vector AABB projection (Cauchy-Schwarz), 50% sphere inflation, depth bias, mip+1, min screen size threshold, camera motion dampening (auto-disable on fast rotations), and per-instance previouslyVisible flag tracking
- Graceful fallback to frustum-only culling if HiZ init fails

Fix dark WMO interiors by gating shadow map sampling on isInterior==0 in the WMO fragment shader. Interior groups (flag 0x2000) now rely solely on pre-baked MOCV vertex-color lighting + MOHD ambient color. Disable interiorDarken globally (was incorrectly darkening outdoor M2s when camera was inside a WMO). Use isInsideInteriorWMO() instead of isInsideWMO() for correct indoor detection.

New files:
- hiz_system.hpp/cpp: pyramid image management, compute pipeline, descriptors, mip-chain build dispatch, resize handling
- hiz_build.comp.glsl: MAX-depth 2x2 reduction compute shader
- m2_cull_hiz.comp.glsl: frustum + HiZ occlusion cull compute shader
- test_indoor_shadows.cpp: 14 unit tests for shadow/interior contracts

Modified:
- CullUniformsGPU expanded 128->272 bytes (HiZ params, viewProj, prevViewProj)
- Depth buffer images gain VK_IMAGE_USAGE_SAMPLED_BIT for HiZ reads
- wmo.frag.glsl: interior branch before unlit, shadow skip for 0x2000
- Render graph: hiz_build + compute_cull disabled (run in early compute)
- .gitignore: ignore compiled .spv binaries
- MEGA_BONE_MAX_INSTANCES: 2048 -> 4096

Signed-off-by: Pavel Okhlopkov <pavel.okhlopkov@flant.com>
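The occlusion rule this message describes (reject an object only when its nearest depth lies behind the farthest depth stored in the pyramid) can be sketched on the CPU as below. This is a minimal C++ sketch under stated assumptions, not the code in m2_cull_hiz.comp.glsl: `chooseHizMip`, `isOccluded` and their parameters are hypothetical names, the depth range is the standard 0 = near, 1 = far, and pyramid level m is assumed to cover 2^(m+1) depth-buffer pixels per texel because level 0 is already a 2x2 reduction.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: pick the pyramid level to sample for an object whose
// projected bounds span widthPx x heightPx depth-buffer pixels.
// Assumption: a texel at level m covers 2^(m+1) source pixels per axis.
int chooseHizMip(float widthPx, float heightPx, int mipCount) {
    float extent = std::max(std::max(widthPx, heightPx), 1.0f);
    // Smallest level whose texel is at least as wide as the footprint.
    int covering = static_cast<int>(std::ceil(std::log2(extent))) - 1;
    // Go one level coarser as a safety margin (the "mip+1" safeguard).
    int mip = covering + 1;
    return std::min(std::max(mip, 0), mipCount - 1);
}

// Hypothetical helper: conservative depth comparison.
// nearestDepth is the closest depth of the object's bounds, hizMaxDepth is the
// farthest depth read from the pyramid, and depthBias only ever makes the test
// less likely to cull.
bool isOccluded(float nearestDepth, float hizMaxDepth, float depthBias) {
    return (nearestDepth - depthBias) > hizMaxDepth;
}
```

With these assumptions a 16-pixel footprint maps to level 4, and an object at depth 0.9 behind a pyramid value of 0.8 is reported as occluded, while one at 0.805 with a 0.01 bias is kept.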
57 lines
2.4 KiB
GLSL
#version 450

// Hierarchical-Z depth pyramid builder.
// Builds successive mip levels from the scene depth buffer.
// Each 2×2 block is reduced to its MAXIMUM depth (farthest/largest value).
// This is conservative for occlusion: an object is only culled when its nearest
// depth exceeds the farthest occluder depth in the pyramid region.
//
// Two modes controlled by push constant:
//   mipLevel == 0: Sample from the source depth texture (mip 0 of the full-res depth).
//   mipLevel >  0: Sample from the previous HiZ mip level.

layout(local_size_x = 8, local_size_y = 8) in;

// Source depth texture (full-resolution scene depth, or previous mip via same image)
layout(set = 0, binding = 0) uniform sampler2D srcDepth;

// Destination mip level (written as storage image)
layout(r32f, set = 0, binding = 1) uniform writeonly image2D dstMip;

layout(push_constant) uniform PushConstants {
    ivec2 dstSize;   // Width and height of the destination mip level
    int   mipLevel;  // Current mip level being built (0 = from scene depth)
};

void main() {
    ivec2 pos = ivec2(gl_GlobalInvocationID.xy);
    if (pos.x >= dstSize.x || pos.y >= dstSize.y) return;

    // Each output texel covers a 2×2 block of the source.
    // Use texelFetch for precise texel access (no filtering).
    ivec2 srcPos = pos * 2;

    float d00, d10, d01, d11;

    // mipLevel == 0 reads lod 0 of the full-res scene depth; mipLevel > 0 reads
    // the previous HiZ mip level (mipLevel - 1).
    int srcLod = (mipLevel == 0) ? 0 : mipLevel - 1;

    // texelFetch outside the image is undefined, so clamp the 2×2 footprint to
    // the last valid texel. Duplicating an edge texel cannot change the maximum.
    ivec2 srcMax = textureSize(srcDepth, srcLod) - ivec2(1);

    d00 = texelFetch(srcDepth, min(srcPos + ivec2(0, 0), srcMax), srcLod).r;
    d10 = texelFetch(srcDepth, min(srcPos + ivec2(1, 0), srcMax), srcLod).r;
    d01 = texelFetch(srcDepth, min(srcPos + ivec2(0, 1), srcMax), srcLod).r;
    d11 = texelFetch(srcDepth, min(srcPos + ivec2(1, 1), srcMax), srcLod).r;

    // Conservative maximum (standard depth buffer: 0=near, 1=far).
    // We store the farthest (largest) depth in each 2×2 block.
    // An object is occluded only when its nearest depth > the farthest occluder
    // depth in the covered screen region, guaranteeing it's behind EVERYTHING.
    float maxDepth = max(max(d00, d10), max(d01, d11));

    imageStore(dstMip, pos, vec4(maxDepth));
}
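A CPU reference for one reduction step can help when checking the shader's output against a known result. The following is a minimal C++ sketch, not part of the repository: `DepthImage`, `reduceMax2x2` and `groupCount` are hypothetical names, the destination size is assumed to be the source size halved (rounded down, never below 1), and reads are clamped to the source edge as an assumption of this sketch for odd or 1-texel sources.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side depth image, row-major, one float per texel.
struct DepthImage {
    int width = 0;
    int height = 0;
    std::vector<float> texels;  // width * height values

    // Edge-clamped read, so a 2x2 footprint never leaves the image.
    float at(int x, int y) const {
        x = std::min(x, width - 1);
        y = std::min(y, height - 1);
        return texels[static_cast<std::size_t>(y) * width + x];
    }
};

// One pyramid step: each destination texel is the MAX of a 2x2 source block.
DepthImage reduceMax2x2(const DepthImage& src) {
    DepthImage dst;
    dst.width = std::max(src.width / 2, 1);    // assumed sizing rule
    dst.height = std::max(src.height / 2, 1);
    dst.texels.resize(static_cast<std::size_t>(dst.width) * dst.height);
    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            int sx = x * 2;
            int sy = y * 2;
            float top = std::max(src.at(sx, sy), src.at(sx + 1, sy));
            float bottom = std::max(src.at(sx, sy + 1), src.at(sx + 1, sy + 1));
            dst.texels[static_cast<std::size_t>(y) * dst.width + x] =
                std::max(top, bottom);
        }
    }
    return dst;
}

// Workgroups needed along one axis for the shader's 8x8 local size.
int groupCount(int size) {
    return (size + 7) / 8;
}
```

Calling `reduceMax2x2` repeatedly until the result is 1x1 yields a full chain under these assumptions, and `groupCount(dst.width)` by `groupCount(dst.height)` gives the dispatch size that the bounds check at the top of `main()` expects.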