Skip to content

[Televiz] ProjectionLayer + in-loop frame protocol refactor#544

Open
farbod-nv wants to merge 4 commits into
mainfrom
fm/projectionlayer
Open

[Televiz] ProjectionLayer + in-loop frame protocol refactor#544
farbod-nv wants to merge 4 commits into
mainfrom
fm/projectionlayer

Conversation

@farbod-nv

@farbod-nv farbod-nv commented May 18, 2026

Copy link
Copy Markdown
Contributor

Adds ProjectionLayer for full-view RGBD content (gsplat, nvblox, neural reconstruction) and refactors VizSession's frame loop so the poses fed to renderers match the poses submitted to OpenXR.

ProjectionLayer

In-loop contract:

info = session.begin_frame()                # xrWaitFrame + Begin + LocateViews
color, depth = renderer.render(info.views)  # render against THIS frame's views
layer.submit(color, depth)                  # publish for this frame
session.end_frame()                         # composite + xrEndFrame

The runtime / CloudXR paces the app via xrWaitFrame; if the renderer takes 30 ms, the app runs at ~30 fps and the runtime compositor reprojects the last submitted frame at display rate.

Storage: per-slot (color, depth) DeviceImages (7-slot mailbox per QuadLayer's pattern). Fragment shader samples color + depth and writes gl_FragDepth so QuadLayer / future OverlayLayer Z-composite correctly in the shared RT. Stereo: paired (left, right) slot storage; submit() ships both eyes on one CUDA stream.

In kXr, a visible ProjectionLayer that doesn't submit for the current frame is skipped at record() time so stale RGBD never gets composited under a new projection-layer pose. The freshness gate is off in kWindow / kOffscreen where no XR pose mismatch is possible.

Single XrCompositionLayerProjection covers QuadLayers + ProjectionLayer together — CloudXR fast-paths when only ProjectionLayer is present, and squashes per-layer-timewarped when mixed (Monado's compute-shader layer accumulator handles this automatically).

VizSession frame-loop refactor

VizSession::begin_frame previously returned placeholder identity views; the real per-eye XR poses were only acquired later inside VizCompositor::render via backend_->begin_frame. That meant renderers calling submit against the returned FrameInfo were rendering against the wrong pose.

Moves backend frame acquisition into VizSession::begin_frame: acquired Frame is stored in current_backend_frame_ and consumed by end_frame. FrameInfo.views now carries the real xrLocateViews output, so the in-loop pattern above is pose-correct by construction.

VizCompositor::render takes the acquired Frame as a parameter; the existing FrameGuard still owns end_frame/abort_frame protocol balance.

Adds Frame::predicted_display_time_ns so XR's
last_frame_state_.predictedDisplayTime flows through to FrameInfo.predicted_display_time.

LayerBase::on_frame_begin

New virtual called from VizSession::begin_frame on every registered layer (default no-op). ProjectionLayer uses it to clear its submitted-this-frame freshness flag.

Minor changes

  • DeviceImage allows PixelFormat::kD32F (was rejected as "reserved for ProjectionLayer"). kD32F + mip_levels > 1 rejected explicitly.
  • Python bindings: expose ViewInfo and FrameInfo.views. No ProjectionViewSnapshot / predicted_views / get_last_frame_info — the renderer reads info.views returned by begin_frame.

Tests

  • 13 C++ Catch2 tests (4 unit + 9 GPU): config validation, slot/view allocation, idempotent destroy, submit shape/format/dimension validation, mono+depth mailbox advance, stereo paired submit, no-depth path, on_frame_begin/submitted-this-frame toggling.
  • 8 Python pytest tests covering the same surface plus the in-loop submit pattern via begin_frame/end_frame.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added ProjectionLayer for projecting stereo/mono RGB-D (color + depth) content into the visualization pipeline, with support for flexible format configurations and frame-by-frame submissions.
    • Exposed ProjectionLayer to Python with full API support including stereo/depth configuration and buffer submission methods.
  • Enhancements

    • Extended frame information with predicted display timing from the backend system.
    • Improved frame lifecycle management with new frame-begin callbacks for layers.

@coderabbitai

coderabbitai Bot commented May 18, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3b49db73-72ac-4792-ad88-d5fdcae8d230

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR introduces ProjectionLayer, a new Vulkan compositing layer for stereo/mono RGBD (color + optional depth) rendering with CUDA interoperability. It refactors the session/compositor frame acquisition to pass frames explicitly rather than acquiring them internally, adds a new LayerBase::on_frame_begin() callback for per-frame CPU state management, and exposes the new functionality to Python with comprehensive tests covering configuration, submission validation, and render-loop semantics.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

The PR introduces significant new functionality (ProjectionLayer implementation, frame lifecycle refactor, shader programs, Python bindings) across heterogeneous files with dense logic in submission, rendering, and frame synchronization paths. The frame lifecycle refactor affects the compositor core and session state management, requiring careful review of exception handling and RAII patterns. Multiple independent subsystems (Vulkan resource management, CUDA interop, descriptor set binding, pipeline selection) must be verified separately for correctness.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 18.64% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the two main changes: adding ProjectionLayer and refactoring the frame protocol, matching the changeset's scope and objectives.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fm/projectionlayer

Comment @coderabbitai help to get the list of available commands.

@farbod-nv farbod-nv force-pushed the fm/projectionlayer branch from af669a5 to e1c8895 Compare May 18, 2026 22:14
@farbod-nv farbod-nv changed the title viz: ProjectionLayer + in-loop frame protocol refactor [Televiz] ProjectionLayer + in-loop frame protocol refactor Jun 8, 2026
@farbod-nv farbod-nv requested a review from junjeechao June 8, 2026 16:14
@farbod-nv farbod-nv marked this pull request as ready for review June 8, 2026 16:14

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/viz/layers_tests/cpp/test_projection_layer.cpp`:
- Around line 86-95: The GPU skip gate must also catch initialization failures:
wrap the Vulkan setup calls so that exceptions from VkContext::init({}) and
RenderTarget::create(...) are converted into SKIP(...) instead of letting them
throw; keep the initial is_gpu_available() check, then try/catch around ctx.init
and creation of target (referencing VkContext::init, RenderTarget::create,
RenderTarget::Config and Resolution) and call SKIP("...") with a brief message
when an exception is caught so the test suite skips cleanly on runners without
working Vulkan.

In `@src/viz/python/layers_bindings.cpp`:
- Around line 184-234: The binding for ProjectionLayer.submit in
layers_bindings.cpp currently throws std::runtime_error only for the left_color
nil case but lets std::invalid_argument from viz::ProjectionLayer::submit
propagate (which becomes Python ValueError), causing test mismatch; wrap the
call to self.submit(...) in a try/catch that catches std::invalid_argument
(thrown by the C++ submit/validation in src/viz/layers/cpp/projection_layer.cpp)
and rethrow as std::runtime_error so pybind11 converts it to Python
RuntimeError; update the catch to include the original exception message when
constructing the std::runtime_error to preserve error details.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9d2a40b9-036d-453c-95d3-6a3b59e9b9ba

📥 Commits

Reviewing files that changed from the base of the PR and between 3bdb119 and e1c8895.

📒 Files selected for processing (22)
  • src/viz/core/cpp/device_image.cpp
  • src/viz/layers/cpp/CMakeLists.txt
  • src/viz/layers/cpp/inc/viz/layers/projection_layer.hpp
  • src/viz/layers/cpp/projection_layer.cpp
  • src/viz/layers_tests/cpp/CMakeLists.txt
  • src/viz/layers_tests/cpp/test_projection_layer.cpp
  • src/viz/python/core_bindings.cpp
  • src/viz/python/layers_bindings.cpp
  • src/viz/python/session_bindings.cpp
  • src/viz/python/viz_init.py
  • src/viz/python_tests/test_projection_layer.py
  • src/viz/session/cpp/inc/viz/session/display_backend.hpp
  • src/viz/session/cpp/inc/viz/session/layer_base.hpp
  • src/viz/session/cpp/inc/viz/session/viz_compositor.hpp
  • src/viz/session/cpp/inc/viz/session/viz_session.hpp
  • src/viz/session/cpp/viz_compositor.cpp
  • src/viz/session/cpp/viz_session.cpp
  • src/viz/session/cpp/xr_backend.cpp
  • src/viz/shaders/cpp/CMakeLists.txt
  • src/viz/shaders/cpp/projection_layer.frag
  • src/viz/shaders/cpp/projection_layer.vert
  • src/viz/shaders/cpp/projection_layer_no_depth.frag

Comment thread src/viz/layers_tests/cpp/test_projection_layer.cpp
Comment thread src/viz/python/layers_bindings.cpp
Adds ProjectionLayer for full-view RGBD content (gsplat, nvblox,
neural reconstruction) and refactors VizSession's frame loop so the
poses fed to renderers match the poses submitted to OpenXR.

## ProjectionLayer

In-loop contract:

    info = session.begin_frame()                # xrWaitFrame + Begin + LocateViews
    color, depth = renderer.render(info.views)  # render against THIS frame's views
    layer.submit(color, depth)                  # publish for this frame
    session.end_frame()                         # composite + xrEndFrame

The runtime / CloudXR paces the app via xrWaitFrame; if the renderer
takes 30 ms, the app runs at ~30 fps and the runtime compositor
reprojects the last submitted frame at display rate.

Storage: per-slot (color, depth) DeviceImages (7-slot mailbox per
QuadLayer's pattern). Fragment shader samples color + depth and writes
gl_FragDepth so QuadLayer / future OverlayLayer Z-composite correctly
in the shared RT. Stereo: paired (left, right) slot storage; submit()
ships both eyes on one CUDA stream.

In kXr, a visible ProjectionLayer that doesn't submit for the current
frame is skipped at record() time so stale RGBD never gets composited
under a new projection-layer pose. The freshness gate is off in
kWindow / kOffscreen where no XR pose mismatch is possible.

Single XrCompositionLayerProjection covers QuadLayers + ProjectionLayer
together — CloudXR fast-paths when only ProjectionLayer is present,
and squashes per-layer-timewarped when mixed (Monado's compute-shader
layer accumulator handles this automatically).

## VizSession frame-loop refactor

``VizSession::begin_frame`` previously returned placeholder identity
views; the real per-eye XR poses were only acquired later inside
``VizCompositor::render`` via ``backend_->begin_frame``. That meant
renderers calling ``submit`` against the returned FrameInfo were
rendering against the wrong pose.

Moves backend frame acquisition into ``VizSession::begin_frame``:
acquired Frame is stored in ``current_backend_frame_`` and consumed
by ``end_frame``. ``FrameInfo.views`` now carries the real xrLocateViews
output, so the in-loop pattern above is pose-correct by construction.

VizCompositor::render takes the acquired Frame as a parameter; the
existing FrameGuard still owns end_frame/abort_frame protocol balance.

Adds Frame::predicted_display_time_ns so XR's
``last_frame_state_.predictedDisplayTime`` flows through to
``FrameInfo.predicted_display_time``.

## LayerBase::on_frame_begin

New virtual called from VizSession::begin_frame on every registered
layer (default no-op). ProjectionLayer uses it to clear its
submitted-this-frame freshness flag.

## Minor changes

- ``DeviceImage`` allows PixelFormat::kD32F (was rejected as
  "reserved for ProjectionLayer"). kD32F + mip_levels > 1 rejected
  explicitly.
- Python bindings: expose ``ViewInfo`` and ``FrameInfo.views``.
  No ``ProjectionViewSnapshot`` / ``predicted_views`` / ``get_last_frame_info``
  — the renderer reads ``info.views`` returned by ``begin_frame``.

## Tests

- 13 C++ Catch2 tests (4 unit + 9 GPU): config validation, slot/view
  allocation, idempotent destroy, submit shape/format/dimension
  validation, mono+depth mailbox advance, stereo paired submit,
  no-depth path, on_frame_begin/submitted-this-frame toggling.
- 8 Python pytest tests covering the same surface plus the in-loop
  submit pattern via begin_frame/end_frame.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@farbod-nv farbod-nv force-pushed the fm/projectionlayer branch 2 times, most recently from 8ab3ba0 to bff088c Compare June 24, 2026 17:53
- ProjectionLayer is direct-present only: per-view (color, depth) copied
  straight into the OpenXR swapchains (like holohub xr_gsplat), no shared
  render target or gl_FragDepth. A VizSession holds either one ProjectionLayer
  or N QuadLayers, enforced in add_layer.
- DisplayBackend gains supports_direct()/record_direct()/
  recommended_view_resolution(); implemented for XR, window, offscreen.
- Depth DeviceImage stored as R32_SFLOAT (color), not D32_SFLOAT: depth-format
  images don't round-trip CUDA external-memory array interop (hardware depth
  compression yields garbage). XrBackend bridges R32_SFLOAT -> D32_SFLOAT depth
  swapchain via a staging buffer (image->buffer->image; float bits verbatim).
- Window direct path fills the sRGB render target and reuses the RT->swapchain
  blit (correct sRGB round-trip + BGRA channel order + scaling); prefer
  R8G8B8A8_SRGB surface.
- Drop the XR host-wait before xrEndFrame; pipeline like holohub and rely on
  OpenXR queue ordering, avoiding reprojection tearing from the added latency.
- Remove projection_layer shaders (no longer rendered through a pipeline).
ProjectionLayer is implemented (direct-present RGBD layer), so replace the
'coming soon' placeholders in the televiz guide and architecture overview
with the real config table, in-loop submit pattern, and the single-
ProjectionLayer-XOR-QuadLayers constraint.
@farbod-nv farbod-nv force-pushed the fm/projectionlayer branch from bff088c to bacb9fd Compare June 24, 2026 19:57
- LayerBase::vk_context() (overridden by ProjectionLayer/QuadLayer);
  VizSession::add_layer rejects a layer built from a different VkContext
  than the session's — its images/semaphores would be used on the wrong
  device/queue. Closes the C++ add_layer<L>(other_ctx, ...) hole; the
  Python add_*_layer bindings already use the session context.
- Session failure-path tests: foreign-context rejection, view_resolution
  vs recommended mismatch, single-ProjectionLayer-XOR-QuadLayers invariant.
- Drop the redundant defensive copy-extent clamps in the XR/offscreen
  direct paths (add_layer already guarantees the 1:1 size match) and make
  the validation comments concise.
- ProjectionLayer.submit binding: re-raise C++ invalid_argument (bad call
  shapes) as runtime_error so Python sees RuntimeError, matching the
  buffer-conversion errors and the Python tests.
@farbod-nv farbod-nv force-pushed the fm/projectionlayer branch from 00f4a51 to 1be8fd5 Compare June 24, 2026 23:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants