-
Notifications
You must be signed in to change notification settings - Fork 5.5k
feat: Add trace replay support for BroadcastWrite operator to presto_cpp #26690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Reviewer's GuideAdds trace replay support for the BroadcastWrite operator in presto_cpp by introducing a dedicated BroadcastWriteReplayer, wiring it into the trace runner and server trace-node factory, and providing comprehensive tests (including multi-driver scenarios and configurable output directory). Sequence diagram for BroadcastWrite trace replay with configurable output directorysequenceDiagram
actor Developer
participant TraceReplayerMain
participant PrestoTraceReplayRunner
participant BroadcastWriteReplayer
participant BroadcastWriteNode
participant FileSystem
Developer->>TraceReplayerMain: Start trace_replayer with node_name=BroadcastWrite and broadcast_write_output_dir
TraceReplayerMain->>PrestoTraceReplayRunner: create Runner with FLAGS including broadcast_write_output_dir
Developer->>PrestoTraceReplayRunner: run()
PrestoTraceReplayRunner->>PrestoTraceReplayRunner: inspect nodeName
PrestoTraceReplayRunner->>PrestoTraceReplayRunner: VELOX_USER_CHECK broadcast_write_output_dir not empty
PrestoTraceReplayRunner->>BroadcastWriteReplayer: construct BroadcastWriteReplayer(traceDir, queryId, taskId, nodeId, nodeName, driverIds, queryCapacity, executor, replayOutputDir)
PrestoTraceReplayRunner->>BroadcastWriteReplayer: createPlanNode(node, nodeId, source)
BroadcastWriteReplayer->>BroadcastWriteReplayer: dynamic_cast to facebook::presto::operators::BroadcastWriteNode
BroadcastWriteReplayer->>BroadcastWriteNode: construct BroadcastWriteNode(nodeId, replayOutputDir, maxBroadcastBytes, serdeRowType, source)
PrestoTraceReplayRunner->>BroadcastWriteNode: execute replayed plan
BroadcastWriteNode->>FileSystem: write broadcast data under replayOutputDir
FileSystem-->>BroadcastWriteNode: data persisted for inspection
BroadcastWriteNode-->>Developer: replayed broadcast data available in output directory
Updated class diagram for BroadcastWriteReplayer and related trace replay classesclassDiagram
namespace facebook_velox_tool_trace {
class OperatorReplayerBase {
<<abstract>>
+OperatorReplayerBase(string traceDir, string queryId, string taskId, string nodeId, string nodeName, string extraInfo, string driverIds, uint64_t queryCapacity, folly_Executor* executor)
+core_PlanNodePtr createPlan()*
+void run()
#core_PlanNodePtr createPlanNode(const core_PlanNode* node, const core_PlanNodeId& nodeId, const core_PlanNodePtr& source) *
}
class BroadcastWriteReplayer {
+BroadcastWriteReplayer(string traceDir, string queryId, string taskId, string nodeId, string nodeName, string driverIds, uint64_t queryCapacity, folly_Executor* executor, string replayOutputDir)
-core_PlanNodePtr createPlanNode(const core_PlanNode* node, const core_PlanNodeId& nodeId, const core_PlanNodePtr& source) const
-string replayOutputDir_
}
}
namespace facebook_presto_operators {
class BroadcastWriteNode {
+BroadcastWriteNode(core_PlanNodeId id, string basePath, uint64_t maxBroadcastBytes, TypePtr serdeRowType, core_PlanNodePtr source)
+string basePath()
+uint64_t maxBroadcastBytes()
+TypePtr serdeRowType()
+vector~core_PlanNodePtr~ sources()
}
}
namespace facebook_velox_core {
class PlanNode {
<<abstract>>
+core_PlanNodeId id()
+RowTypePtr outputType()
+vector~core_PlanNodePtr~ sources()
}
}
namespace facebook_velox_exec_trace {
class DummySourceNode {
+DummySourceNode(RowTypePtr outputType)
}
}
class PrestoTraceReplayRunner {
+PrestoTraceReplayRunner()
+unique_ptr~OperatorReplayerBase~ createOperatorReplayer(string nodeName)
}
class PrestoServer {
+void registerTraceNodeFactories()
}
facebook_velox_tool_trace_OperatorReplayerBase <|-- facebook_velox_tool_trace_BroadcastWriteReplayer
facebook_velox_core_PlanNode <|-- facebook_presto_operators_BroadcastWriteNode
facebook_velox_core_PlanNode <|-- facebook_velox_exec_trace_DummySourceNode
PrestoTraceReplayRunner ..> facebook_velox_tool_trace_BroadcastWriteReplayer : creates
facebook_velox_tool_trace_BroadcastWriteReplayer ..> facebook_presto_operators_BroadcastWriteNode : constructs in createPlanNode
facebook_presto_operators_BroadcastWriteNode o--> facebook_velox_core_PlanNode : source
facebook_presto_operators_BroadcastWriteNode ..> facebook_velox_exec_trace_DummySourceNode : used in trace factory
PrestoServer ..> facebook_presto_operators_BroadcastWriteNode : registers trace node factory
PrestoServer ..> facebook_velox_exec_trace_DummySourceNode : uses as dummy source
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- The BroadcastWrite trace-node registration logic is duplicated between PrestoServer::registerTraceNodeFactories and BroadcastWriteReplayerTest::SetUpTestCase; consider factoring this into a shared helper to avoid future drift.
- In TraceReplayerMain, the broadcast_write_output_dir flag is only checked for non-emptiness; it may be more robust to also validate that the directory exists (or can be created) before running the replay.
- The tests modify the global operator registry (registerOperator/unregisterAllOperators); consider using an RAII-style guard or restoring previous state after each test to avoid surprising interactions with other tests that rely on operator registration.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The BroadcastWrite trace-node registration logic is duplicated between PrestoServer::registerTraceNodeFactories and BroadcastWriteReplayerTest::SetUpTestCase; consider factoring this into a shared helper to avoid future drift.
- In TraceReplayerMain, the broadcast_write_output_dir flag is only checked for non-emptiness; it may be more robust to also validate that the directory exists (or can be created) before running the replay.
- The tests modify the global operator registry (registerOperator/unregisterAllOperators); consider using an RAII-style guard or restoring previous state after each test to avoid surprising interactions with other tests that rely on operator registration.
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/PrestoServer.cpp:1818-1825` </location>
<code_context>
+ "BroadcastWrite",
+ [](const velox::core::PlanNode* traceNode,
+ const velox::core::PlanNodeId& nodeId) -> velox::core::PlanNodePtr {
+ if (const auto* broadcastWriteNode =
+ dynamic_cast<const operators::BroadcastWriteNode*>(traceNode)) {
+ return std::make_shared<operators::BroadcastWriteNode>(
+ nodeId,
+ broadcastWriteNode->basePath(),
+ broadcastWriteNode->maxBroadcastBytes(),
+ broadcastWriteNode->serdeRowType(),
+ std::make_shared<velox::exec::trace::DummySourceNode>(
+ broadcastWriteNode->sources().front()->outputType()));
+ }
</code_context>
<issue_to_address>
**issue (bug_risk):** Guard against unexpected empty sources when constructing DummySourceNode.
This assumes `broadcastWriteNode->sources()` is non-empty and calls `.front()` directly. In malformed or partially recorded traces this could be empty, leading to undefined behavior or a crash. Please add a defensive check (e.g., `VELOX_CHECK(!broadcastWriteNode->sources().empty())`) or otherwise handle the empty case so trace replay fails with a clear error.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| if (const auto* broadcastWriteNode = | ||
| dynamic_cast<const operators::BroadcastWriteNode*>(traceNode)) { | ||
| return std::make_shared<operators::BroadcastWriteNode>( | ||
| nodeId, | ||
| broadcastWriteNode->basePath(), | ||
| broadcastWriteNode->maxBroadcastBytes(), | ||
| broadcastWriteNode->serdeRowType(), | ||
| std::make_shared<velox::exec::trace::DummySourceNode>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Guard against unexpected empty sources when constructing DummySourceNode.
This assumes broadcastWriteNode->sources() is non-empty and calls .front() directly. In malformed or partially recorded traces this could be empty, leading to undefined behavior or a crash. Please add a defensive check (e.g., VELOX_CHECK(!broadcastWriteNode->sources().empty())) or otherwise handle the empty case so trace replay fails with a clear error.
xiaoxmeng
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@han-yan01 please make sure it is e2e working before land.
…cpp (prestodb#26690) Summary: Add trace replay support for BroadcastWrite operator to presto_cpp, instead of ws as intermediate storage, allow user-defined path for sink data inspection # Release Notes ``` == NO RELEASE NOTE == ``` Reviewed By: xiaoxmeng Differential Revision: D87790828
04ebc10 to
41c5de3
Compare
Summary:
Add trace replay support for BroadcastWrite operator to presto_cpp, instead of ws as intermediate storage, allow user-defined path for sink data inspection
Release Notes
Differential Revision: D87790828