perf ticket 009: GPU-driven rendering — indirect multi-draw + GPU cull

Deferred perf ticket — see [docs/perf/009-gpu-driven-rendering.md](../blob/main/docs/perf/009-gpu-driven-rendering.md).

## Summary

Replace the scene graph's per-mesh CPU draw loop (one `set_bind_group` + `draw_indexed` per mesh, ~340 calls/frame on Sponza across the shadow, main, and depth-prepass passes) with a single `draw_indexed_indirect_count` call per render pass, backed by a GPU-side frustum-cull compute pass. Each render pass then issues one draw regardless of mesh count.
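To make the cull-and-compact step concrete, here is a minimal CPU-side reference of what the compute pass would do on the GPU. It is a sketch under assumptions, not the planned implementation: `MeshDescriptor`, `Plane`, and `cull_to_indirect` are hypothetical names, bounding spheres are assumed as the cull primitive, and the argument struct follows the standard indexed-indirect layout (five 32-bit fields, as in Vulkan's `VkDrawIndexedIndirectCommand`).

```rust
/// One indirect draw record: five 32-bit fields, 20 bytes total.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct DrawIndexedIndirectArgs {
    pub index_count: u32,
    pub instance_count: u32,
    pub first_index: u32,
    pub base_vertex: i32,
    pub first_instance: u32,
}

/// Hypothetical per-mesh record the cull pass would read.
#[derive(Clone, Copy)]
pub struct MeshDescriptor {
    pub center: [f32; 3], // world-space bounding-sphere center (assumed)
    pub radius: f32,
    pub index_count: u32,
    pub first_index: u32,
    pub base_vertex: i32,
}

/// Plane (nx, ny, nz, d) with an inward-facing unit normal:
/// a point p is inside when dot(n, p) + d >= 0.
pub type Plane = [f32; 4];

fn sphere_in_frustum(planes: &[Plane; 6], c: [f32; 3], r: f32) -> bool {
    planes
        .iter()
        .all(|p| p[0] * c[0] + p[1] * c[1] + p[2] * c[2] + p[3] >= -r)
}

/// Emits one indirect record per visible mesh and returns the draw count.
/// On the GPU, each mesh would be one compute invocation and the count
/// would be an atomic counter in a separate buffer.
pub fn cull_to_indirect(
    planes: &[Plane; 6],
    meshes: &[MeshDescriptor],
) -> (Vec<DrawIndexedIndirectArgs>, u32) {
    let mut args = Vec::with_capacity(meshes.len());
    for (i, m) in meshes.iter().enumerate() {
        if sphere_in_frustum(planes, m.center, m.radius) {
            args.push(DrawIndexedIndirectArgs {
                index_count: m.index_count,
                instance_count: 1,
                first_index: m.first_index,
                base_vertex: m.base_vertex,
                // Carries the mesh index so the vertex shader can look up
                // the matching descriptor (transform, material).
                first_instance: i as u32,
            });
        }
    }
    let count = args.len() as u32;
    (args, count)
}
```

The returned vector and count stand in for the indirect-args buffer and the count buffer that the single indirect draw would consume.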

## Why deferred

Pure **CPU-side** optimization on a **GPU-bound** benchmark. The perf README's own rule of thumb: "Sponza is GPU-bound, not CPU-bound. Don't chase CPU micro-optimizations expecting FPS improvement." Render-total CPU is already ~4 ms against a 16.7 ms vsync budget after the landed 001-017 wins (uniform pool, frustum cull, matrix-inverse cache, shadow cascade cache). Shaving another ~600 µs won't move FPS on Sponza — we'd be optimizing a resource we already have in surplus.
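The reasoning above can be checked with a toy frame-pacing model. This is a simplification that assumes CPU and GPU work fully overlap, so the slower side paces the frame; the 14 ms GPU time in the usage note is a hypothetical figure chosen for illustration, not a measurement.

```rust
/// Simplified model: CPU and GPU work overlap, so the frame is paced
/// by whichever side takes longer.
pub fn frame_ms(cpu_ms: f64, gpu_ms: f64) -> f64 {
    cpu_ms.max(gpu_ms)
}

/// CPU time left unused within the vsync budget.
pub fn cpu_headroom_ms(cpu_ms: f64, budget_ms: f64) -> f64 {
    budget_ms - cpu_ms
}
```

With a hypothetical 14 ms of GPU work, `frame_ms(4.0, 14.0)` and `frame_ms(3.4, 14.0)` are identical: cutting ~600 µs of CPU time only widens a headroom that is already about 12.7 ms against the 16.7 ms budget.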

## Reopen criteria

- **A CPU-bound scene arrives.** For example: 10 000+ meshes, many small static props, or CPU-expensive per-frame state updates that push `render_total` CPU time past the vsync budget.
- **Ticket 008 (visibility buffer) reopens.** 008's shading pass needs a shared vertex/index buffer plus a per-mesh descriptor buffer, which is exactly what this ticket builds. That makes 009 a hard prerequisite for 008.
- **Bindless texture support lands in wgpu.** The current "one `set_bind_group` per draw" pattern is partly about per-material texture binds. Bindless makes indirect multi-draw a straightforward win without the material-binding workarounds the ticket's notes describe.
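As a rough illustration of the shared vertex/index buffer and per-mesh descriptor mentioned under the ticket 008 criterion, the sketch below packs meshes into one vertex array and one index array and records each mesh's range. `SharedGeometry`, `MeshRange`, and `push_mesh` are hypothetical names, and position-only vertices are assumed for brevity.

```rust
/// Where one mesh lives inside the shared buffers.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct MeshRange {
    pub first_index: u32,
    pub index_count: u32,
    pub base_vertex: i32,
}

/// CPU-side staging for the shared vertex buffer, index buffer,
/// and per-mesh descriptor buffer.
#[derive(Default)]
pub struct SharedGeometry {
    pub vertices: Vec<[f32; 3]>,
    pub indices: Vec<u32>,
    pub ranges: Vec<MeshRange>,
}

impl SharedGeometry {
    /// Appends a mesh and returns its index into `ranges`.
    /// Indices stay mesh-local; `base_vertex` rebases them at draw time.
    pub fn push_mesh(&mut self, vertices: &[[f32; 3]], indices: &[u32]) -> usize {
        let range = MeshRange {
            first_index: self.indices.len() as u32,
            index_count: indices.len() as u32,
            base_vertex: self.vertices.len() as i32,
        };
        self.vertices.extend_from_slice(vertices);
        self.indices.extend_from_slice(indices);
        self.ranges.push(range);
        self.ranges.len() - 1
    }
}
```

Keeping indices mesh-local and relying on `base_vertex` means existing index data can be copied in unchanged, at the cost of one extra field per descriptor.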

## Effort

**~1 week** for the baseline `draw_indexed_indirect_count` path with GPU frustum cull. Material indirection still requires either bindless (not widely supported in wgpu 29) or a texture-array trick — that's where the risk sits, and why it's scoped at "week" not "days."
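One possible shape for the texture-array trick is sketched below: each unique texture gets a layer in a single array texture, and the per-mesh descriptor stores that layer index instead of requiring a bind-group switch. `TextureLayers` and `layer_for` are hypothetical names. The sketch also assumes every texture shares one size and format, which an array texture requires, and that the layer cap comes from the device limits.

```rust
use std::collections::HashMap;

/// Assigns each unique texture a layer in one array texture.
pub struct TextureLayers {
    layer_of: HashMap<String, u32>,
    max_layers: u32, // assumed to come from the device's array-layer limit
}

impl TextureLayers {
    pub fn new(max_layers: u32) -> Self {
        Self { layer_of: HashMap::new(), max_layers }
    }

    /// Returns the layer for `texture`, allocating the next free layer on
    /// first use. Returns `None` once the array is full.
    pub fn layer_for(&mut self, texture: &str) -> Option<u32> {
        if let Some(&layer) = self.layer_of.get(texture) {
            return Some(layer);
        }
        let next = self.layer_of.len() as u32;
        if next >= self.max_layers {
            return None;
        }
        self.layer_of.insert(texture.to_owned(), next);
        Some(next)
    }
}
```

The `None` case is where the risk noted above shows up: scenes with mixed texture sizes or more textures than available layers need several arrays or a fallback path.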

## Files

- `native/shared/src/renderer/mod.rs` — shared VB/IB, descriptor buffer, GPU cull compute shader, render pass using `draw_indexed_indirect_count`.
- `native/shared/src/scene.rs` — reworking of per-node GPU resources.
