feat(xr): implement OpenXR runtime session control and stereo rendering pipeline #4
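## VR feature implementation checklist

### 1) OpenXR lifecycle and runtime session control (implemented)
- Integrates OpenXR instance/system initialization, Vulkan adaptation, and session management.
- Supports starting and stopping the XR session at runtime instead of relying only on a state fixed at startup.
- Session state changes can drive render-path switching and trigger resource re-creation.

### 2) Stereo rendering main path (implemented)
- World Pipeline adds an eyeCount and StereoMode scheduling mechanism.
- Several world modules support per-eye rendering (single instance with multiple dispatches, or two instances) and bind per-eye resource views.
- Output is extended from a single-eye image to array layers, supporting per-eye image processing and compositing.

### 3) Visibility mask (implemented)
- Queries the visibility mask mesh from OpenXR.
- Generates a per-eye mask texture and uses it in the Ray Tracing path.
- Falls back to a fully visible path when no stereo session is active, avoiding incorrect occlusion.

### 4) Tracking-space to world-space mapping (implemented)
- VRSystem maintains the HMD pose, per-eye parameters, and world rotation/translation offsets.
- Introduces per-eye view/proj offsets in the uniforms and shaders to map tracking space to the game world.
- Supports recentering while keeping the world offsets consistent.

### 5) Gaze-dependent rendering (implemented)
- Integrates the eye-tracking gaze data path (gazePoint/gazeValid).
- Ray Tracing takes center/outer-ring parameters to implement gaze-dependent downsampling and fill for the outer ring.

### 6) Controller input and interaction (implemented)
- Creates OpenXR ActionSet/Action objects and binds the mainstream profiles (e.g. Touch/Index/Vive/simple).
- Syncs controller poses and trigger/grip/thumbstick/button states to VRSystem.
- Provides JNI interfaces for upper layers to read head and hand poses, FOV, recommended resolution, and similar data.

### 7) Haptic feedback (implemented)
- Supports interfaces to trigger and stop haptic vibration on the left and right hands.

### 8) Performance data (implemented)
- Integrates CPU/GPU frame-time statistics and the compositor target time.
- Reports VR performance metrics such as FPS, headroom, and dropped-frame count.

### 9) Build and device selection (implemented)
- CMake adds an OpenXR switch and dependency integration.
- Vulkan instance/device creation allows injecting the extensions OpenXR requires.
- Physical device selection prefers the device specified by OpenXR.

### 10) Desktop mirror and fallback behavior (implemented)
- Supports mirroring the stereo result to the desktop window (SBS path).
- Falls back to the non-VR rendering path when OpenXR initialization fails.

### 11) Pipeline re-creation after session switches (implemented)
- Session start/stop and XR resolution changes trigger needRecreate.
- When the render chain switches between mono and stereo, eyeCount and the related resource sizes are updated.

### 12) Device/session information query interfaces (implemented)
- Can query the system name, session state, eye resolution, floor height, and other runtime information.
- Provides interfaces to set and query world position and world orientation, supporting the recenter workflow.

## Main affected areas
- OpenXR core:
  - src/core/render/openxr_context.cpp
  - src/core/render/openxr_context.hpp
  - src/core/render/openxr_input.cpp
  - src/core/render/openxr_input.hpp
- XR bridge and state:
  - src/core/middleware/com_radiance_client_proxy_vulkan_VRProxy.cpp
  - src/core/render/render_framework.cpp
  - src/core/render/render_framework.hpp
  - src/core/render/renderer.hpp
  - src/core/render/vr_system.cpp
  - src/core/render/vr_system.hpp
- Stereo rendering modules and shaders:
  - src/core/render/pipeline.cpp
  - src/core/render/pipeline.hpp
  - src/core/render/modules/world/*
  - src/shader/world/post_render/*
  - src/shader/world/ray_tracing/*
- Vulkan/XR bootstrap:
  - CMakeLists.txt
  - src/core/CMakeLists.txt
  - src/core/vulkan/instance.*
  - src/core/vulkan/device.*
  - src/core/vulkan/physical_device.*
  - src/core/vulkan/image.*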
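
The tracking-space to world-space mapping in item 4 can be illustrated with a minimal sketch. It assumes the world offset is a yaw rotation about +Y followed by a translation; the PR only says "world rotation/translation offsets", so the yaw-only restriction and the names `Vec3`, `WorldOffset`, `trackingToWorld`, and `recenter` are hypothetical and not taken from VRSystem.

```cpp
#include <cmath>

// Hypothetical minimal vector type; VRSystem presumably uses its own math types.
struct Vec3 { float x, y, z; };

// Hypothetical world offset: yaw about +Y, then translation (an assumption).
struct WorldOffset {
    float yawRadians;
    Vec3 translation;
};

// Map a tracking-space position into world space: rotate about +Y, then translate.
inline Vec3 trackingToWorld(const Vec3& p, const WorldOffset& o) {
    const float c = std::cos(o.yawRadians);
    const float s = std::sin(o.yawRadians);
    return Vec3{ c * p.x + s * p.z + o.translation.x,
                 p.y + o.translation.y,
                 -s * p.x + c * p.z + o.translation.z };
}

// Recenter: keep the yaw, and pick the translation so that the current head
// position in tracking space lands on the desired world position.
inline WorldOffset recenter(const Vec3& headTracking, const Vec3& desiredWorld,
                            float yawRadians) {
    WorldOffset o{yawRadians, Vec3{0.0f, 0.0f, 0.0f}};
    const Vec3 rotated = trackingToWorld(headTracking, o);
    o.translation = Vec3{desiredWorld.x - rotated.x,
                         desiredWorld.y - rotated.y,
                         desiredWorld.z - rotated.z};
    return o;
}
```

Because the offset is applied after tracking, a recenter only has to recompute the translation; poses reported by the runtime stay untouched.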
Pull request overview
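This PR extends the render framework so that OpenXR sessions can be started and stopped at runtime, and extends the world render chain from single-eye output to stereo array-layer output. It also introduces per-eye view/proj, visibility masks, gaze-dependent foveated rendering, and the initialization bridge for Vulkan device/extension injection.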
Changes:
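- Adds OpenXRContext/OpenXRInput/VRSystem, and implements session lifecycle control plus XR swapchain submission and mirror output in Framework.
- World pipeline introduces eyeCount and StereoMode scheduling; the core modules (ray tracing / temporal / tone mapping / post / NRD / FSR3 / DLSS) render per eye and bind resource views per layer.
- The Vulkan side supports injecting the instance/device extensions OpenXR requires and overriding physical device selection; Image supports per-layer image views.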
Reviewed changes
Copilot reviewed 47 out of 47 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/shader/world/ray_tracing/world.rgen | Adds visibility mask early-out, per-eye view/proj, foveated block rendering, and block-fill. |
| src/shader/world/ray_tracing/end_portal.rchit | Adds a push constant and switches to per-eye view/proj projection. |
| src/shader/world/ray_tracing/end_gateway.rchit | Same as above: per-eye view/proj projection. |
| src/shader/world/post_render/world_post_star.vert | Adds the eyeIndex push constant; per-eye view/proj transform. |
| src/shader/world/post_render/world_post.vert | Adds the eyeIndex push constant; per-eye view/proj and coordinate-system transform. |
| src/core/vulkan/physical_device.hpp | Adds a static overrideDevice to support the physical device specified by OpenXR. |
| src/core/vulkan/physical_device.cpp | Prefers overrideDevice when selecting the physical device. |
| src/core/vulkan/instance.hpp | Adds static extraExtensions to inject OpenXR instance extensions. |
| src/core/vulkan/instance.cpp | Merges extraExtensions when creating the VkInstance. |
| src/core/vulkan/device.hpp | Adds static extraExtensions to inject OpenXR device extensions. |
| src/core/vulkan/device.cpp | Merges extraExtensions when creating the VkDevice. |
| src/core/vulkan/image.hpp | Adds the per-layer view API (createPerLayerViews/perLayerView) and a layer subresource helper. |
| src/core/vulkan/image.cpp | Implements per-layer image view creation and per-layer view lookup. |
| src/core/render/vr_system.hpp | Adds VRSystem plus the controller, performance, and eye-parameter data structures and interfaces. |
| src/core/render/vr_system.cpp | Implements projection/view offsets, simulation update, OpenXR update, and recenter. |
| src/core/render/renderer.hpp | Renderer Options gains a VR switch and parameters, and holds the VRSystem. |
| src/core/render/render_framework.hpp | Framework holds the OpenXRContext and adds session control interfaces. |
| src/core/render/render_framework.cpp | Framework: staged OpenXR initialization, runtime begin/end frame, XR blit, GPU/CPU timing, and re-creation linkage. |
| src/core/render/pipeline.hpp | WorldPipelineBuildParams adds eyeCount; WorldPipeline/Context add eyeCount state. |
| src/core/render/pipeline.cpp | Chooses eyeCount from the XR session state; creates array-layer render targets; StereoMode scheduling and mirror SBS blit. |
| src/core/render/openxr_input.hpp | Declares the OpenXR input/action/haptic/gaze interfaces. |
| src/core/render/openxr_input.cpp | Implements actions/bindings, per-frame input sync, haptics, and gaze computation. |
| src/core/render/openxr_context.hpp | Adds the OpenXR lifecycle/session/swapchain/visibility mask query interfaces. |
| src/core/render/openxr_context.cpp | Implements pre/post Vulkan initialization, session control, swapchain acquire/release, begin/end frame, and visibility mask retrieval. |
| src/core/render/modules/world/world_module.hpp | Introduces StereoMode and the per-eye render entry points (render3D/renderEye/currentEyeIndex). |
| src/core/render/modules/world/tone_mapping/tone_mapping_module.hpp | ToneMapping supports stereo (per-eye framebuffer/descriptorTable). |
| src/core/render/modules/world/tone_mapping/tone_mapping_module.cpp | ToneMapping implements 3D dispatch, per-eye output, and the exposure sharing strategy. |
| src/core/render/modules/world/temporal_accumulation/temporal_accumulation_module.hpp | TemporalAccumulation declares its stereoMode and adds renderEye. |
| src/core/render/modules/world/temporal_accumulation/temporal_accumulation_module.cpp | TemporalAccumulation indexes resources per eye; adds the renderEye path and per-layer copy/blit. |
| src/core/render/modules/world/ray_tracing/ray_tracing_module.hpp | Ray tracing push constant adds eyeIndex; descriptor table becomes [frame][eye]; adds visibility mask resources. |
| src/core/render/modules/world/ray_tracing/ray_tracing_module.cpp | Ray tracing dispatches per eye, binds outputs per layer, and generates/uploads the visibility mask texture and binds it to the shader. |
| src/core/render/modules/world/post_render/post_render_module.hpp | PostRender adds the eyeIndex push constant and switches to per-eye tables/framebuffers. |
| src/core/render/modules/world/post_render/post_render_module.cpp | PostRender renders per eye with push constants; per-layer copy and per-eye depth framebuffer. |
| src/core/render/modules/world/nrd/nrd_module.hpp | NRD uses one wrapper per eye; composition adds eyeIndex; intermediate resources are indexed per eye. |
| src/core/render/modules/world/nrd/nrd_module.cpp | NRD per-eye wrapper/init/denoise; adds a single-layer viewZ copy and per-eye composition dispatch for stereo. |
| src/core/render/modules/world/fsr_upscaler/upscaler_module.hpp | FSR3 becomes per-eye upscaler + per-eye intermediates; adds renderEye. |
| src/core/render/modules/world/fsr_upscaler/upscaler_module.cpp | FSR3 per-eye dispatch; array-layer input is copied to a single-layer intermediate; output is written back to the target array layer. |
| src/core/render/modules/world/dlss/dlss_wrapper.hpp | setResource adds viewIndex to support array-layer views. |
| src/core/render/modules/world/dlss/dlss_wrapper.cpp | DLSS resource binding uses the specified image view / layer range. |
| src/core/render/modules/world/dlss/dlss_module.hpp | DLSS becomes per-eye instances and adds renderEye. |
| src/core/render/modules/world/dlss/dlss_module.cpp | DLSS per-eye instantiation/resource binding/denoise, and handles firstHitDepth per layer. |
| src/core/render/buffers.hpp | Buffers adds foveated parameter setters and internal fields. |
| src/core/render/buffers.cpp | WorldUBO fills the stereo/foveated fields and writes the gaze center. |
| src/core/middleware/com_radiance_client_proxy_vulkan_VRProxy.cpp | Adds the JNI VRProxy: session control, parameter setting, pose/input/performance queries, haptics, and recenter. |
| src/core/CMakeLists.txt | core target adds the OpenXR dependency, include paths, and compile definitions. |
| src/common/shared.hpp | WorldUBO adds stereo/foveated fields (eye offsets, ipd, foveated parameters). |
| CMakeLists.txt | Adds OpenXR-SDK FetchContent and a build switch. |
    vec2 center = resolution * worldUBO.foveatedCenter;
    float halfDiag = length(center);
    uint bs = worldUBO.foveatedOuterBlockSize;
    // Use block origin distance so all pixels in a block make the same decision
    ivec2 blockOrigin = (pixel / ivec2(bs)) * ivec2(bs);
    float originDist = length(vec2(blockOrigin) + 0.5 - center) / halfDiag;
    if (originDist > worldUBO.foveatedInnerRadius) {
Foveated rendering distance normalization uses halfDiag = length(center). When gaze moves away from screen center, this changes the denominator and can even become 0 (e.g. gaze at (0,0)), causing division-by-zero and overly aggressive block sizing. Use a constant normalization (e.g. half diagonal of the render target) or compute max distance from center to the corners, and guard against zero.
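
One way to follow this suggestion is sketched below on the CPU side in C++; the same arithmetic would carry over to the GLSL in world.rgen. It normalizes by the largest distance from the gaze center to any render-target corner and clamps the denominator. The names `Vec2`, `vecLength`, `maxCornerDistance`, and `normalizedBlockDistance` are hypothetical, and the epsilon value is an assumption.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical 2D vector standing in for GLSL's vec2.
struct Vec2 { float x, y; };

inline float vecLength(Vec2 v) { return std::sqrt(v.x * v.x + v.y * v.y); }

// Largest distance from the gaze center to any of the four render-target corners.
inline float maxCornerDistance(Vec2 center, Vec2 resolution) {
    const Vec2 corners[4] = {{0.0f, 0.0f},
                             {resolution.x, 0.0f},
                             {0.0f, resolution.y},
                             {resolution.x, resolution.y}};
    float maxDist = 0.0f;
    for (const Vec2& c : corners) {
        maxDist = std::max(maxDist, vecLength(Vec2{c.x - center.x, c.y - center.y}));
    }
    return maxDist;
}

// Distance of the pixel's block origin from the gaze center, normalized so the
// farthest corner maps to 1. gazeUv is the gaze position in [0,1] texture space.
inline float normalizedBlockDistance(int pixelX, int pixelY, unsigned blockSize,
                                     Vec2 gazeUv, Vec2 resolution) {
    const int bs = static_cast<int>(std::max(blockSize, 1u));  // avoid dividing by 0
    const Vec2 center{resolution.x * gazeUv.x, resolution.y * gazeUv.y};
    const Vec2 origin{static_cast<float>((pixelX / bs) * bs) + 0.5f,
                      static_cast<float>((pixelY / bs) * bs) + 0.5f};
    // Guard the denominator (assumed epsilon) so a degenerate target cannot yield inf/NaN.
    const float denom = std::max(maxCornerDistance(center, resolution), 1e-6f);
    return vecLength(Vec2{origin.x - center.x, origin.y - center.y}) / denom;
}
```

With this normalization a gaze at (0,0) no longer produces a zero denominator, and the inner-radius threshold keeps the same meaning relative to the farthest visible pixel wherever the gaze lands.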
…and Update VRPerformanceStats
The final sprint of the day. Brings up real ray tracing against real
chunk geometry: per-chunk BLAS builds, full TLAS rebuild from BLAS
instances, an MVP shader trio, and a RayTracingAdapter::execute that
actually calls vkCmdTraceRaysKHR.
Acceleration structure services:
* blas_service: per-chunk bottom-level AS via
vkCmdBuildAccelerationStructuresKHR. Shared 32 MB scratch buffer
(no per-chunk allocation thrash). Post-build barrier
AS_BUILD_WRITE -> RT_SHADER_READ. Old BLAS released synchronously
on chunk re-upload (deferred-delete is risk #1 for follow-up).
No compaction yet (risk #3); ALLOW_COMPACTION_BIT not set.
* tlas_service: full-rebuild TLAS from BLAS instance list. Builds two
parallel BDA SSBOs alongside the TLAS so the closest-hit shader can
fetch vertex/index buffers via gl_InstanceCustomIndexEXT. Chunk-origin
SSBO is currently zero-filled - TODO at tlas_service.cpp:73 to thread
GpuChunkData.origin through BlasData (risk #4).
Ray tracing adapter execute side:
* raytracing_adapter.cpp: full implementation.
- init: loads v2_world.{rgen,rmiss,rchit}, builds RT pipeline via
vk2::RtPipeline + ShaderBindingTable.
- registerPass: declares output (RT_HDR_OUTPUT, rgba16f), no inputs.
- execute (per frame):
1. Allocate 4 descriptor sets via vk2::DescriptorAllocator
2. Write set 0 (TLAS), set 1 (vertex/index BDA SSBOs from TLAS),
set 2 (WorldUBO + SkyUBO from SceneResourceService),
set 3 (output storage image)
3. vkCmdBindPipeline + vkCmdBindDescriptorSets x4
4. vkCmdTraceRaysKHR(cmd, rgen, miss, hit, callable, w, h, 1)
- Per-frame full descriptor allocation costs scale with bindless slot
count - risk #9 for the 4096-texture case.
MVP shader trio (v2_min/):
* v2_world.rgen (94 lines): reads camera from WorldUBO, generates primary
rays through pixel center, calls traceRayEXT, writes rgba16f.
* v2_world.rmiss (31 lines): sky gradient from SkyUBO.horizonColor ->
SkyUBO.baseColor + simple sun-disc dot-product against SkyUBO.sunDir.
* v2_world.rchit (86 lines): gl_InstanceCustomIndexEXT -> per-instance
BDA SSBO -> uvec3 indices -> 3 PBRTriangle vertices -> barycentric
interpolation -> face normal -> Lambertian shading against sun
direction. No PBR/shadows/GI yet - those come with PR39+.
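
The closest-hit shading chain described for v2_world.rchit (barycentric interpolation, face normal, Lambertian term) can be sketched in C++ as follows. This is an illustration of the math only, not the shader source; `Vec3` and the helper names are hypothetical, and the barycentric convention (weights u and v belong to the second and third vertex) is the one used by Vulkan ray tracing hit attributes.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical vector type standing in for GLSL's vec3.
struct Vec3 { float x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
inline Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(dot(v, v));
    return len > 0.0f ? Vec3{v.x / len, v.y / len, v.z / len} : v;
}

// Interpolate a per-vertex attribute with hit barycentrics (u, v);
// the first vertex gets the remaining weight 1 - u - v.
inline Vec3 interpolate(Vec3 a0, Vec3 a1, Vec3 a2, float u, float v) {
    const float w = 1.0f - u - v;
    return {w * a0.x + u * a1.x + v * a2.x,
            w * a0.y + u * a1.y + v * a2.y,
            w * a0.z + u * a1.z + v * a2.z};
}

// Geometric (face) normal from the triangle's two edges.
inline Vec3 faceNormal(Vec3 p0, Vec3 p1, Vec3 p2) {
    return normalize(cross(sub(p1, p0), sub(p2, p0)));
}

// Lambertian term: cosine between normal and direction to the sun, clamped at 0.
inline float lambert(Vec3 normal, Vec3 sunDir) {
    return std::max(dot(normal, normalize(sunDir)), 0.0f);
}
```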
Closes PR33 (BLAS), PR34 (TLAS), PR35 (RT pipeline init), PR36 (first rays).
This is the buildable checkpoint - core.dll @ MCVR/bin/core.dll built
cleanly at 23:53:49 on 2026-04-06 from the working tree at this commit.
Remaining tactical work (per FUTURE-INSPECT.md follow-ups):
PR37 - denoiser (SVGF / TAA)
PR39 - real material shading (replaces v2_min Lambertian)
PR40 - DLSS-RR upscaler integration
PR41 - NRD denoiser integration
PR42 - Frame Gen integration
PR43 - V2 config snapshot wiring
PR44 - legacy renderer removal
Turns the V2 render graph from 'tracing into empty TLAS' into 'tracing
real chunk geometry' by plumbing Minecraft's chunk mesh output through
the V2 scene services. Also fixes risks #1 (use-after-free on chunk
re-upload), #4 (chunk origin TODO), and #7 (garbage energy LUT) from the
original 4-pass analysis.
The big shift: chunks.cpp now tees the mesh worker output straight into
CmdChunkSubmit after BlockMesher::mesh() returns. This path catches
~100% of in-game chunk uploads (the rebuildSingle fallback tee is still
fixed but only fires on the slow path).
ChunkId now 3D (x, sectionY, z):
* scene_types.hpp: added sectionY field and updated std::hash
* Was a latent design bug - 24 column sections collided on a single
  {x, z} key, so 23/24 of every column's data was silently dropped
* replay_recorder.{hpp,cpp}: bumped binary version 1 -> 2 and added
  sectionY to on-disk record; older replays invalid (intentional)
Deferred-delete via ResourceGC (risk #1 fix):
* blas_service.{hpp,cpp}: init() now takes ResourceGC*; retireBlas()
  wraps VkAccelerationStructureKHR + vk2::Buffer in shared_ptr and
  defers destruction via gc->defer(lambda). Chunks re-meshed rapidly in
  the same pose will no longer free in-flight BLAS data.
* tlas_service.{hpp,cpp}: same pattern for old TLAS buffers on rebuild,
  plus for the scratch buffer if it's ever replaced mid-flight.
* gpu_upload_service.{hpp,cpp}: same pattern for vertex/index buffers
  on chunk re-upload and removeChunk.
* ResourceGC has a 32-frame ring, framesInFlight=2, so deferred
  destructors stay alive for at least 31 frames - plenty of slack.
Chunk origin baked into TLAS instance transform (risk #4 fix):
* blas_service.hpp: BlasData now caches originX/Y/Z as float (copied
  from GpuChunkData on build).
* tlas_service.cpp: inst.transform.matrix now writes translation
  (originX, originY, originZ) in the column-3 slot instead of an
  identity matrix. The closest-hit shader gl_ObjectToWorldEXT
  automatically reflects this - no rchit changes needed.
* Dropped chunkOriginBuffer_ and chunkOriginArraySize_ - origin is in
  the transform, no parallel SSBO needed. Resolves the
  tlas_service.cpp:73 TODO from the original session summary.
Energy LUT initialized to 1.0 (risk #7 fix):
* scene_resource_service.{hpp,cpp}: added pendingEnergyLutStaging_
  member. init() creates a host-visible staging buffer filled with
  half-float 1.0 (0x3C00) covering all 64x64x4 channels.
* New runDeferredInit(cmd) records vkCmdCopyBufferToImage on first
  call, with proper layout transitions UNDEFINED -> TRANSFER_DST ->
  SHADER_READ_ONLY. Staging buffer freed after copy.
* Called from engine_app.cpp processScene() first frame only.
Production chunk tee (chunks.cpp, +88 lines):
* After BlockMesher::mesh() returns the meshOutput, copy solid +
  cutout + translucent PBR triangles into a merged vertex array with
  index re-offsetting for V2.
* Post as CmdChunkSubmit via EngineServices::bridge().
* Only fires when useV2Engine is true and the bridge has been
  initialized (EngineApp::instance() != nullptr check).
* This is the hot path - the rebuildSingle fallback path tee in
  ChunkProxy.cpp is kept working (with sectionY) for completeness.
ChunkProxy.cpp fix:
* The existing rebuildSingle tee was using {cx, cz} as the key with no
  Y component - all 24 column sections silently collided. Now derives
  sectionY from cmd.originY >> 4 and uses a proper 3D key.
CmdChunkSubmit/CmdChunkRemove schema:
* bridge_service.hpp: added int32 sectionY field to both command
  structs, placed between chunkZ and existing origin fields for
  locality.
* engine_app.cpp: handleCommand uses the 3D key throughout.
engine_app.cpp:
* Scene service init() calls now pass &services_->frame().gc() so the
  deferred-delete wiring is functional.
* processScene() first-frame path calls sceneRes().runDeferredInit(cmd)
  after scene UBO update but before RT adapter reads.
* Chunk command handlers use ChunkId{x, sectionY, z} throughout.
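
The 3D chunk key described above can be illustrated with a small sketch. The field names follow the commit message (x, sectionY, z), but the struct layout, the hash-combine constant, and the helper `sectionFromOriginY` are assumptions made for illustration; the actual scene_types.hpp may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Sketch of a 3D chunk key; layout is assumed, not copied from scene_types.hpp.
struct ChunkId {
    std::int32_t x;
    std::int32_t sectionY;
    std::int32_t z;
    bool operator==(const ChunkId& o) const {
        return x == o.x && sectionY == o.sectionY && z == o.z;
    }
};

namespace std {
template <>
struct hash<ChunkId> {
    size_t operator()(const ChunkId& id) const noexcept {
        // Boost-style hash combine; the constant is a common choice, not the PR's.
        size_t h = std::hash<std::int32_t>{}(id.x);
        h ^= std::hash<std::int32_t>{}(id.sectionY) + 0x9e3779b9u + (h << 6) + (h >> 2);
        h ^= std::hash<std::int32_t>{}(id.z) + 0x9e3779b9u + (h << 6) + (h >> 2);
        return h;
    }
};
}  // namespace std

// Section index from a block-space Y origin (16 blocks per section), mirroring
// the "originY >> 4" derivation. For negative values this relies on an
// arithmetic right shift, which C++20 guarantees.
inline std::int32_t sectionFromOriginY(std::int32_t originY) { return originY >> 4; }
```

With sectionY in both the equality operator and the hash, the 24 sections of one column occupy 24 distinct map entries instead of overwriting each other under a shared {x, z} key.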
Smoke test results (in-game, 90 seconds of V2 gameplay):
* Pre-fix baseline (empty TLAS): 1.65 ms avg, 651 fps, sky-only
* After-fix with real chunks: 4.17 ms avg, 240 fps (vsync capped)
* Chunks flowed: 30323 -> 31007 (684 new chunk meshes through tee)
* Engine log: 99832 bytes, 0 errors, 0 warnings, 0 validation issues
* Deferred-delete: confirmed by service init log 'deferred-delete: on'
  on all three scene services.
* Process exit: clean in 1s, no deadlock
The +2.5 ms per frame is real BLAS build + TLAS rebuild + closest-hit
shader work replacing the previous sky-gradient miss-only path. The V2
graph has real headroom - it's hitting 240 Hz vsync, not the GPU
ceiling.
Closes risks 1, 4, 7 from the original 4-pass analysis. Unblocks PR37
(denoiser), PR39 (real materials), PR40 (DLSS-RR) which all need real RT
output to validate against.
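
The deferred-delete pattern this commit relies on (queue a destructor, run it only after enough frames have passed) can be sketched as a frame ring. This is a generic illustration and not the ResourceGC implementation: the class name `DeferredDeleter`, its methods, and the exact retirement delay are assumptions.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical frame-ring garbage collector. A destructor queued during frame N
// runs when the ring wraps back to the same slot, i.e. on the RingSize-th call
// to advanceFrame() after it was queued.
template <std::size_t RingSize>
class DeferredDeleter {
public:
    // Queue a destruction callback; typically a lambda owning the GPU resource.
    void defer(std::function<void()> destroy) {
        buckets_[current_].push_back(std::move(destroy));
    }

    // Call once per frame: move to the next slot and run whatever was queued
    // there one full ring revolution ago.
    void advanceFrame() {
        current_ = (current_ + 1) % RingSize;
        std::vector<std::function<void()>> pending = std::move(buckets_[current_]);
        buckets_[current_].clear();
        for (auto& destroy : pending) {
            destroy();
        }
    }

private:
    std::array<std::vector<std::function<void()>>, RingSize> buckets_{};
    std::size_t current_ = 0;
};
```

As long as the ring is longer than the number of frames in flight, a resource retired on re-upload cannot be freed while a command buffer that references it is still executing.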
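Testing
Runs correctly on a 9700X with a 9070 XT.
Only the NRD and FSR3 paths were tested.
The DLSS path could not be tested because no suitable hardware was available.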