Performance: ring buffer UBOs, batched load screen uploads, background world preloader

- Replace per-frame VMA alloc/free of material UBOs with a ring buffer in
  CharacterRenderer (~500 allocations/frame eliminated)
- Batch all ready terrain tiles into a single GPU upload during load screen
  (processAllReadyTiles instead of one-at-a-time with individual fence waits)
- Lift per-frame creature/GO spawn budgets during load screen warmup phase
- Add background world preloader: saves last world position to disk, pre-warms
  AssetManager file cache with ADT files starting at app init (login screen)
  so terrain workers get instant cache hits when Enter World is clicked
- Distance-filter expensive collision guard to 8-unit melee range
- Merge 3 CharacterRenderer update loops into single pass
- Add time-budget instrumentation for slow update stages (>3ms threshold)
- Add count-based async creature model upload budget (max 3/frame in-game)
- Limit game object spawns to 1 per frame and add a per-doodad time budget
  for transport loading
- Use deque for creature spawn queue to avoid O(n) front-erase
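
A minimal sketch of the ring-buffer sub-allocation described in the first bullet, assuming one pre-allocated buffer per frame slot whose write offset is reset at the start of each frame. It is not the CharacterRenderer code: `MaterialRing`, `beginFrame`, and `push` are hypothetical names, a `std::vector` stands in for the persistently mapped `VkBuffer` so the example runs without Vulkan, and treating the capacity as a count of sub-allocations (rather than bytes) is an assumption.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical stand-in for a per-frame-slot material UBO ring.
struct MaterialRing {
    static constexpr uint32_t kFrameSlots = 2;

    uint32_t stride;    // UBO size rounded up to the required offset alignment
    uint32_t capacity;  // sub-allocations per frame slot (assumed meaning)
    std::vector<unsigned char> storage[kFrameSlots];  // stands in for mapped memory
    uint32_t offset[kFrameSlots] = {0, 0};            // next free byte in each slot

    MaterialRing(uint32_t uboSize, uint32_t alignment, uint32_t cap)
        : stride((uboSize + alignment - 1) / alignment * alignment), capacity(cap) {
        for (auto& s : storage) s.resize(static_cast<size_t>(stride) * capacity);
    }

    // Called once per frame for the slot about to be recorded; nothing is freed.
    void beginFrame(uint32_t slot) { offset[slot] = 0; }

    // Copies one material block into the slot and returns its byte offset,
    // which would be used as the descriptor's buffer offset. Returns nullopt
    // when the block is larger than the stride or the slot is full.
    std::optional<uint32_t> push(uint32_t slot, const void* data, uint32_t size) {
        if (size > stride || offset[slot] / stride >= capacity) return std::nullopt;
        const uint32_t at = offset[slot];
        std::memcpy(storage[slot].data() + at, data, size);
        offset[slot] += stride;
        return at;
    }
};
```

With a 256-byte alignment, successive `push` calls on the same slot return offsets 0, 256, 512, and so on, so each draw costs one `memcpy` and an offset bump instead of a buffer allocation and a later free.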
Kelsi 2026-03-07 13:44:09 -08:00
parent 71e8ed5b7d
commit 0313bd8692
7 changed files with 390 additions and 121 deletions


@@ -254,7 +254,14 @@ private:
VkDescriptorPool materialDescPools_[2] = {VK_NULL_HANDLE, VK_NULL_HANDLE};
VkDescriptorPool boneDescPool_ = VK_NULL_HANDLE;
uint32_t lastMaterialPoolResetFrame_ = 0xFFFFFFFFu;
std::vector<std::pair<VkBuffer, VmaAllocation>> transientMaterialUbos_[2];
// Material UBO ring buffer — pre-allocated per frame slot, sub-allocated each draw
VkBuffer materialRingBuffer_[2] = {VK_NULL_HANDLE, VK_NULL_HANDLE};
VmaAllocation materialRingAlloc_[2] = {VK_NULL_HANDLE, VK_NULL_HANDLE};
void* materialRingMapped_[2] = {nullptr, nullptr};
uint32_t materialRingOffset_[2] = {0, 0};
uint32_t materialUboAlignment_ = 256; // minUniformBufferOffsetAlignment
static constexpr uint32_t MATERIAL_RING_CAPACITY = 4096;
// Texture cache
struct TextureCacheEntry {