Skip to content

Conversation

@mfaferek93
Copy link
Collaborator

@mfaferek93 mfaferek93 commented Jan 17, 2026

Capture topic data when faults transition to CONFIRMED status, enabling post-mortem debugging of system state at the time of fault occurrence.

  • Add SnapshotCapture class with configurable topic resolution (fault_specific > patterns > default_topics priority)
  • Support on-demand and background capture modes
  • Store snapshots as JSON in SQLite with indexed fault_code lookup
  • Add GetSnapshots service to FaultManager
  • Add REST endpoints: GET /api/v1/faults/{code}/snapshots and GET /api/v1/components/{id}/faults/{code}/snapshots

Closes #81

Pull Request

Summary

Briefly describe what changed and why.


Issue

Link the related issue (required):


Type

  • Bug fix
  • New feature or tests
  • Breaking change
  • Documentation only

Testing

How was this tested / how should reviewers verify it?


Checklist

  • Breaking changes are clearly described (and announced in docs / changelog if needed)
  • Tests were added or updated if needed
  • Docs were updated if behavior or public API changed

Capture topic data when faults transition to CONFIRMED status, enabling
post-mortem debugging of system state at the time of fault occurrence.

- Add SnapshotCapture class with configurable topic resolution
  (fault_specific > patterns > default_topics priority)
- Support on-demand and background capture modes
- Store snapshots as JSON in SQLite with indexed fault_code lookup
- Add GetSnapshots service to FaultManager
- Add REST endpoints: GET /api/v1/faults/{code}/snapshots and
  GET /api/v1/components/{id}/faults/{code}/snapshots
- Add unit tests for snapshot storage operations

Closes #81
Copilot AI review requested due to automatic review settings January 17, 2026 13:40
@mfaferek93 mfaferek93 self-assigned this Jan 17, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds snapshot capture functionality to enable post-mortem debugging of system state when faults transition to CONFIRMED status. The implementation captures topic data (serialized as JSON) and stores it in SQLite, with REST API endpoints for retrieval.

Note: There is an inconsistency in the PR description - the title references issue #81, but the description's "Issue" section says "closes #82". Please verify and correct the issue number.

Changes:

  • Add SnapshotCapture class with configurable topic resolution (fault_specific > patterns > default_topics priority) and support for on-demand and background capture modes
  • Implement snapshot storage in SQLite with indexed lookup and in-memory storage for tests
  • Add GetSnapshots ROS 2 service and REST API endpoints for retrieving snapshots

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
src/ros2_medkit_msgs/srv/GetSnapshots.srv New service definition for retrieving fault snapshots with optional topic filtering
src/ros2_medkit_msgs/CMakeLists.txt Register GetSnapshots service in build system
src/ros2_medkit_gateway/src/http/rest_server.cpp Add two new REST endpoints for snapshot access (system-wide and component-scoped)
src/ros2_medkit_gateway/src/http/handlers/fault_handlers.cpp Implement HTTP handlers for snapshot retrieval with validation and error handling
src/ros2_medkit_gateway/src/fault_manager.cpp Add client-side implementation for GetSnapshots service calls
src/ros2_medkit_gateway/include/ros2_medkit_gateway/http/handlers/fault_handlers.hpp Add handler method declarations for snapshot endpoints
src/ros2_medkit_gateway/include/ros2_medkit_gateway/fault_manager.hpp Add get_snapshots method and service client member
src/ros2_medkit_fault_manager/test/test_sqlite_storage.cpp Add unit tests for snapshot storage operations (store, retrieve, filter, multiple)
src/ros2_medkit_fault_manager/src/sqlite_fault_storage.cpp Implement snapshot table schema and storage operations with indexes
src/ros2_medkit_fault_manager/src/snapshot_capture.cpp New class implementing topic capture logic with on-demand and background modes
src/ros2_medkit_fault_manager/src/fault_storage.cpp Add snapshot storage to in-memory implementation
src/ros2_medkit_fault_manager/src/fault_manager_node.cpp Integrate snapshot capture triggering on fault confirmation and add GetSnapshots service handler
src/ros2_medkit_fault_manager/package.xml Add dependencies for ros2_medkit_serialization and nlohmann-json-dev
src/ros2_medkit_fault_manager/include/ros2_medkit_fault_manager/sqlite_fault_storage.hpp Add snapshot method declarations to SQLite storage interface
src/ros2_medkit_fault_manager/include/ros2_medkit_fault_manager/snapshot_capture.hpp New header defining SnapshotCapture class and configuration structures
src/ros2_medkit_fault_manager/include/ros2_medkit_fault_manager/fault_storage.hpp Add SnapshotData struct and snapshot methods to storage interface
src/ros2_medkit_fault_manager/include/ros2_medkit_fault_manager/fault_manager_node.hpp Add snapshot capture member and service handler declaration
src/ros2_medkit_fault_manager/CMakeLists.txt Add snapshot_capture.cpp to build, link serialization and JSON libraries

@mfaferek93 mfaferek93 marked this pull request as draft January 17, 2026 13:58
- Add snapshot endpoints to Gateway README with examples
- Add 3 snapshot requests to Postman collection
- Add REQ_INTEROP_088 for fault snapshots in requirements
- Add note about SOVD "environment data" terminology mapping
- Add @verifies tags to snapshot storage tests
- Fix reentrancy: use dedicated callback group and local executor
  instead of spin_some on main node (avoids service callback issues)
- Improve regex handling: log ERROR for invalid patterns, track count
- Improve error message: specify 1-256 character limit for fault codes
- Fix captured_at: omit field when no snapshots exist (was epoch 0.0)
- Add yaml-cpp comment explaining transitive dependency
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 8 comments.

- Fix README: document correct response format (topics object, not array)
- Fix Postman: update description to reflect actual response structure
- Fix executor cleanup: use local callback group per capture operation
  (ensures clean lifecycle - both executor and callback group destroyed
  together when function returns)
- Add 14 unit tests for SnapshotCapture class (constructor, config, regex, topic resolution)
- Add 6 unit tests for FaultManager::get_snapshots client (timeout, JSON parsing, errors)
- Add 5 integration tests for REST snapshot endpoints (404/400 error handling)
- Add snapshot endpoints to root endpoint response
- Add snapshots.config_file parameter for loading fault_specific and patterns
- Implement load_snapshot_config_from_yaml() using yaml-cpp
- Add example config file (config/snapshots.yaml)
- Add 5 integration tests for snapshot capture functionality
- Tests verify capture on fault confirmation, error handling, and topic filtering
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated 5 comments.

- Add validation for timeout_sec parameter (must be positive)
- Add validation for max_message_size parameter (must be positive)
- Document snapshots.config_file parameter in README
- Remove outdated TODO markers, add YAML config example
- Use assertGreater instead of assertTrue for better error messages
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated no new comments.

- Add snapshot cleanup to SqliteFaultStorage::clear_fault
- Add snapshot cleanup to InMemoryFaultStorage::clear_fault
- Add unit test verifying snapshots are deleted on clear
@mfaferek93 mfaferek93 requested a review from bburda January 17, 2026 17:09
- Add ~/get_snapshots service to Services table
- Add snapshot capture feature description
- Add Snapshot Parameters section with all config options
- Include YAML config example and topic resolution priority
@mfaferek93 mfaferek93 marked this pull request as ready for review January 17, 2026 17:32
- Create docs/tutorials/snapshots.rst with full configuration guide
- Cover quick start, advanced YAML config, querying via REST API
- Include example workflow and troubleshooting section
- Add to tutorials index in Basic Tutorials section
@mfaferek93 mfaferek93 requested a review from bburda January 17, 2026 20:01
@mfaferek93 mfaferek93 merged commit 89d3c70 into main Jan 17, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Snapshot Capture

3 participants