perf: eliminate ~70 unnecessary sqrt ops per frame, optimize caches and threading

Squared distance optimizations across 30 files:
- Convert glm::length() comparisons to glm::dot() (no sqrt)
- Use glm::inversesqrt() for check-then-normalize patterns (1 rsqrt vs 2 sqrt)
- Defer sqrt to after early-out checks in collision/movement code
- Hottest paths: camera_controller (21), weather particles, WMO collision,
  transport movement, creature interpolation, nameplate culling

Container and algorithm improvements:
- std::map<string> → std::unordered_map for asset/DBC/MPQ/warden caches
- std::mutex → std::shared_mutex for asset_manager and mpq_manager caches
- std::sort → std::partial_sort in lighting_manager (top-2 of N volumes)
- Double-lookup find()+operator[] → insert_or_assign in game_handler
- Add reserve() for per-frame vectors: weather, swim_effects, WMO/M2 collision

Threading and synchronization:
- Replace 1ms busy-wait polling with condition_variable in character_renderer
- Move timestamp capture before mutex in logger
- Use memory_order_acquire/release for normal map completion signaling

API additions:
- DBC getStringView()/getStringViewByOffset() for zero-copy string access
- Parse creature display IDs from SMSG_CREATURE_QUERY_SINGLE_RESPONSE
This commit is contained in:
Kelsi 2026-03-27 16:33:16 -07:00
parent cf0e2aa240
commit b0466e9029
29 changed files with 328 additions and 196 deletions

View file

@ -922,7 +922,8 @@ void WaterRenderer::loadFromWMO([[maybe_unused]] const pipeline::WMOLiquid& liqu
float stepXLen = glm::length(surface.stepX);
float stepYLen = glm::length(surface.stepY);
glm::vec3 planeN = glm::cross(surface.stepX, surface.stepY);
float nz = (glm::length(planeN) > 1e-4f) ? std::abs(glm::normalize(planeN).z) : 0.0f;
float planeNLenSq = glm::dot(planeN, planeN);
float nz = (planeNLenSq > 1e-8f) ? std::abs(planeN.z * glm::inversesqrt(planeNLenSq)) : 0.0f;
float spanX = stepXLen * static_cast<float>(surface.width);
float spanY = stepYLen * static_cast<float>(surface.height);
if (stepXLen < 0.2f || stepXLen > 12.0f ||