perf(evm): bypass virtual stack and cache InterpreterExecContext for … #419
base: main
@@ -667,11 +667,34 @@ void Runtime::callWasmFunctionInInterpMode(Instance &Inst, uint32_t FuncIdx,
 #ifdef ZEN_ENABLE_EVM
 void Runtime::callEVMInInterpMode(EVMInstance &Inst, evmc_message &Msg,
                                   evmc::Result &Result) {
-  evm::InterpreterExecContext Ctx(&Inst);
-  evm::BaseInterpreter Interpreter(Ctx);
-  Ctx.allocTopFrame(&Msg);
-  Interpreter.interpret();
-  Result = std::move(const_cast<evmc::Result &>(Ctx.getExeResult()));
+  // Reuse a thread-local InterpreterExecContext for top-level calls to avoid
+  // re-allocating the ~33 KB EVMFrame (1024 × uint256 stack) on every call.
+  // For nested calls (CALL/CREATE re-entering this function via Host->call()),
+  // we must create a fresh context to avoid corrupting the outer call's state.
+  static thread_local evm::InterpreterExecContext *TLCtx = nullptr;
+  static thread_local bool TLCtxInUse = false;
+
+  if (!TLCtxInUse) {
+    // Top-level call: reuse the cached context
+    if (!TLCtx) {
+      TLCtx = new evm::InterpreterExecContext(&Inst);
+    } else {
+      TLCtx->resetForNewCall(&Inst);
+    }
+    TLCtxInUse = true;
+    evm::BaseInterpreter Interpreter(*TLCtx);
+    TLCtx->allocTopFrame(&Msg);
+    Interpreter.interpret();
+    Result = std::move(const_cast<evmc::Result &>(TLCtx->getExeResult()));
+    TLCtxInUse = false;
Comment on lines +684 to +689

Suggested change:
-    TLCtxInUse = true;
-    evm::BaseInterpreter Interpreter(*TLCtx);
-    TLCtx->allocTopFrame(&Msg);
-    Interpreter.interpret();
-    Result = std::move(const_cast<evmc::Result &>(TLCtx->getExeResult()));
-    TLCtxInUse = false;
+    struct TLCtxGuard {
+      bool &Flag;
+      explicit TLCtxGuard(bool &F) : Flag(F) { Flag = true; }
+      ~TLCtxGuard() { Flag = false; }
+    } Guard(TLCtxInUse);
+    evm::BaseInterpreter Interpreter(*TLCtx);
+    TLCtx->allocTopFrame(&Msg);
+    Interpreter.interpret();
+    Result = std::move(const_cast<evmc::Result &>(TLCtx->getExeResult()));
Copilot AI, Mar 24, 2026
The comment says interpreter mode “manages call depth via InterpreterExecContext::FrameStack”, but CALL/CREATE in the interpreter go through Host->call(NewMsg) (see CallHandler::doExecute()), i.e. they re-enter execution rather than pushing additional frames onto FrameStack. Since FrameStack is currently only used for the single top frame (allocTopFrame()/freeBackFrame()), this rationale seems inaccurate/misleading—please reword to reflect the actual depth management mechanism (evmc_message.depth / host recursion).
Suggested change:
-  // Interpreter mode does not need a virtual stack: it manages call depth
-  // via InterpreterExecContext::FrameStack and never emits native code that
-  // could overflow the physical stack in an unbounded way. Skipping the
-  // virtual-stack allocation/mprotect/setjmp round-trip on every call
-  // eliminates ~50 % of the per-execution overhead measured on ERC-20
-  // transfers.
+  // Interpreter mode does not need a virtual stack: CALL/CREATE re-enter
+  // execution via the host with an incremented evmc_message.depth rather
+  // than pushing additional frames onto InterpreterExecContext::FrameStack,
+  // so call depth is bounded by the EVM depth limit rather than unbounded
+  // native recursion. Skipping the virtual-stack allocation/mprotect/setjmp
+  // round-trip on every call eliminates ~50 % of the per-execution overhead
+  // measured on ERC-20 transfers.
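To make the depth argument concrete, here is a minimal sketch of host-mediated re-entry. It is not the project's code: `Message`, `MaxCallDepth`, and `executeFrame` are hypothetical stand-ins, and the model assumes the host refuses a CALL once the message depth reaches the EVM limit of 1024.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for evmc_message; only the depth field is modeled.
struct Message {
  int32_t Depth = 0;
};

// EVM call-depth limit, assumed to be enforced on the host side.
constexpr int32_t MaxCallDepth = 1024;

// Models a contract that always issues one more CALL. Returns the depth of
// the deepest frame that actually executed.
int32_t executeFrame(const Message &Msg) {
  assert(Msg.Depth >= 0 && Msg.Depth <= MaxCallDepth);
  // Host side: once the limit is reached the CALL fails instead of
  // re-entering, so this frame is the deepest one.
  if (Msg.Depth >= MaxCallDepth)
    return Msg.Depth;
  Message Child;
  Child.Depth = Msg.Depth + 1; // each re-entry carries an incremented depth
  return executeFrame(Child);
}
```

In this model native recursion stops after at most 1025 frames regardless of what the contract does, which is the bound the reworded comment relies on.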
The cached `InterpreterExecContext` is allocated with `new` into a `thread_local` raw pointer and is never freed. Even though it's "only" once per thread, this is still an intentional leak and can matter for short-lived worker threads / fuzzing. Prefer `static thread_local std::unique_ptr<evm::InterpreterExecContext>` (or `std::optional` if default-constructible) so the context is reclaimed on thread exit and ownership is explicit.
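The two review suggestions (owning the cached context through `std::unique_ptr` and clearing the in-use flag with an RAII guard) could be combined roughly as follows. This is a sketch under stated assumptions, not the PR's implementation: `ExecContext`, `InUseGuard`, and `withContext` are hypothetical names, and `ExecContext` only stands in for `evm::InterpreterExecContext`.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for evm::InterpreterExecContext.
struct ExecContext {
  int Resets = 0;
  void resetForNewCall() { ++Resets; }
};

// RAII guard: sets the flag on entry and clears it on scope exit, including
// when the guarded code throws.
struct InUseGuard {
  bool &Flag;
  explicit InUseGuard(bool &F) : Flag(F) { Flag = true; }
  ~InUseGuard() { Flag = false; }
  InUseGuard(const InUseGuard &) = delete;
  InUseGuard &operator=(const InUseGuard &) = delete;
};

// Runs Body with a context: the per-thread cached one for top-level calls,
// a fresh one for nested (re-entrant) calls. The bool tells Body which it got.
void withContext(const std::function<void(ExecContext &, bool)> &Body) {
  // unique_ptr ownership: the cached context is destroyed at thread exit.
  static thread_local std::unique_ptr<ExecContext> Cached;
  static thread_local bool InUse = false;

  if (InUse) {
    // Nested call: never touch the outer call's context.
    ExecContext Fresh;
    Body(Fresh, /*IsCached=*/false);
    return;
  }
  if (!Cached)
    Cached = std::make_unique<ExecContext>();
  else
    Cached->resetForNewCall();
  InUseGuard Guard(InUse);
  Body(*Cached, /*IsCached=*/true);
}
```

The function is deliberately not a template: function-local `thread_local` variables in a template would be duplicated per instantiation, which would defeat the nested-call detection.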