Skip to content

Conversation

@brett-monad
Copy link
Contributor

No description provided.

Copilot AI review requested due to automatic review settings November 4, 2025 15:43
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request refactors the LLVM backend to use a dependency-based intermediate representation instead of a virtual stack approach. The key changes include replacing the virtual stack with a dependency blocks IR that tracks instruction dependencies, moving LLVM initialization into a thread-safe singleton pattern, and adding support for disk-based caching of compiled LLVM code.

  • Introduced dependency blocks IR to replace virtual stack-based compilation
  • Moved LLVM initialization to thread-safe singleton with std::once_flag
  • Added file-based caching for compiled LLVM code
  • Removed the init_llvm() function from test code and integrated it into the compilation flow

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
category/vm/llvm/dependency_blocks.hpp New file implementing dependency-based IR for EVM bytecode compilation
category/vm/llvm/execute.cpp Added thread-safe LLVM initialization and disk loading functionality
category/vm/llvm/llvm.cpp Implemented file cache loading logic for compiled LLVM objects
category/vm/llvm/emitter.hpp Refactored to use dependency blocks IR instead of virtual stack
category/vm/llvm/llvm_state.hpp Added helper methods for disk loading and updated GEP signature
test/vm/vm/test_vm.cpp Removed manual init_llvm() calls as initialization is now automatic
category/vm/llvm/virtual_stack.hpp Removed file as it's replaced by dependency blocks

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +64 to +74
evmc_revision const rev =
static_cast<evmc_revision>(std::stoi(exts[0].substr(1)));
uint256_t const hash256 =
runtime::uint256_t::from_string("0x" + exts[1].substr(1));

evmc::bytes32 const code_hash = bytes32_from_uint256(hash256);

std::shared_ptr<LLVMState> const ptr =
monad::vm::llvm::load_from_disk(rev, entry.path().string());

cached_llvm_code_[rev].insert({code_hash, ptr});
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code assumes exts[0] starts with a specific character and calls substr(1) without validation. If the extension doesn't follow the expected format (e.g., it's empty or doesn't start with the expected character), this will cause undefined behavior or exceptions. Add validation before calling substr(1) to ensure the string has the expected format.

Suggested change
evmc_revision const rev =
static_cast<evmc_revision>(std::stoi(exts[0].substr(1)));
uint256_t const hash256 =
runtime::uint256_t::from_string("0x" + exts[1].substr(1));
evmc::bytes32 const code_hash = bytes32_from_uint256(hash256);
std::shared_ptr<LLVMState> const ptr =
monad::vm::llvm::load_from_disk(rev, entry.path().string());
cached_llvm_code_[rev].insert({code_hash, ptr});
// Validate exts[0] and exts[1] before calling substr(1)
if (!exts[0].empty() && exts[0][0] == '.' &&
!exts[1].empty() && exts[1][0] == '.') {
evmc_revision const rev =
static_cast<evmc_revision>(std::stoi(exts[0].substr(1)));
uint256_t const hash256 =
runtime::uint256_t::from_string("0x" + exts[1].substr(1));
evmc::bytes32 const code_hash = bytes32_from_uint256(hash256);
std::shared_ptr<LLVMState> const ptr =
monad::vm::llvm::load_from_disk(rev, entry.path().string());
cached_llvm_code_[rev].insert({code_hash, ptr});
}

Copilot uses AI. Check for mistakes.
evmc_revision const rev =
static_cast<evmc_revision>(std::stoi(exts[0].substr(1)));
uint256_t const hash256 =
runtime::uint256_t::from_string("0x" + exts[1].substr(1));
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the previous issue, exts[1].substr(1) is called without validating that exts[1] has at least 2 characters. This can cause undefined behavior if the extension is empty or has only one character. Add validation before calling substr(1).

Copilot uses AI. Check for mistakes.
Comment on lines +180 to +182
InstrIdx unspill_offset_of(StackIdx i)
{
return (static_cast<InstrIdx>(-i - 1));
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The formula -i - 1 for calculating unspill offset is not self-documenting and could be confusing. Consider adding a comment explaining the relationship between stack index and unspill instruction index, or using more descriptive intermediate variables to make the calculation clearer.

Suggested change
InstrIdx unspill_offset_of(StackIdx i)
{
return (static_cast<InstrIdx>(-i - 1));
// Converts a stack index to the corresponding unspill instruction index.
// The formula (-i - 1) is used because unspill instructions are indexed
// in reverse order relative to stack indices: stack index 0 maps to the
// last unspill instruction (-1), stack index 1 to (-2), etc.
InstrIdx unspill_offset_of(StackIdx i)
{
InstrIdx unspill_instr_idx = static_cast<InstrIdx>(-i - 1);
return unspill_instr_idx;

Copilot uses AI. Check for mistakes.
Comment on lines +547 to +548
// you can eliminate this subtraction by rewriting the jumptable
// address
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 547-548 suggests an optimization but doesn't explain how it would work or when it should be implemented. Either provide more detailed documentation about the suggested optimization or remove the comment if it's not actionable.

Suggested change
// you can eliminate this subtraction by rewriting the jumptable
// address

Copilot uses AI. Check for mistakes.
verifyModule(*llvm_module)); // BAL: this fails when O2 and
// O3 are applied. Why?
verifyModule(*llvm_module)); // this fails when O2 and O3
// are applied. Why?
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment indicates a known issue where module verification fails with O2/O3 optimization levels, but there's no explanation or tracking reference. This suggests potential correctness issues that should be investigated. Consider adding a TODO with context or a reference to a tracking issue.

Suggested change
// are applied. Why?
// are applied.
// TODO: Investigate why verifyModule fails when O2/O3 optimizations are applied.
// This may indicate a correctness issue in the optimization pipeline or module construction.
// Track this issue (create a GitHub issue or internal ticket) and link it here.

Copilot uses AI. Check for mistakes.
Comment on lines +32 to +42
inline bool is_pure(OpCode op)
{
return (
op == Shl || op == Shr || op == Sar || op == Add || op == Mul ||
op == Sub || op == Div || op == SDiv || op == Mod || op == SMod ||
op == AddMod || op == MulMod || op == SignExtend || op == Lt ||
op == Gt || op == SLt || op == SGt || op == Eq || op == IsZero ||
op == And || op == Or || op == XOr || op == Not || op == Byte);

// should just put true or false in the OpCodeInfo table in opcodes.hpp
}
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 41 suggests this logic should be moved to the OpCodeInfo table. This would be a better design as it centralizes opcode metadata and avoids maintaining duplicate lists. Consider implementing this suggestion or creating a tracking issue for it.

Copilot uses AI. Check for mistakes.
Comment on lines +137 to +141
if (!std::filesystem::exists(nm_obj)) {
ObjectTransformLayer &otl = lljit->getObjTransformLayer();
DumpObjects const dumpobjs = DumpObjects("", nm);
otl.setTransform(dumpobjs);
}
Copy link

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The logic conditionally sets up object dumping only if the file doesn't exist. However, this creates an implicit side effect where the behavior changes based on file system state. Consider making this behavior more explicit or documenting why the check is necessary.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants