Bletner/llvm include sort bug #1874

brett-monad · 2025-11-04T15:43:19Z

No description provided.

Copilot

Pull Request Overview

This pull request refactors the LLVM backend to use a dependency-based intermediate representation instead of a virtual stack approach. The key changes include replacing the virtual stack with a dependency blocks IR that tracks instruction dependencies, moving LLVM initialization into a thread-safe singleton pattern, and adding support for disk-based caching of compiled LLVM code.

Introduced dependency blocks IR to replace virtual stack-based compilation
Moved LLVM initialization to thread-safe singleton with std::once_flag
Added file-based caching for compiled LLVM code
Removed the init_llvm() function from test code and integrated it into the compilation flow

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
category/vm/llvm/dependency_blocks.hpp	New file implementing dependency-based IR for EVM bytecode compilation
category/vm/llvm/execute.cpp	Added thread-safe LLVM initialization and disk loading functionality
category/vm/llvm/llvm.cpp	Implemented file cache loading logic for compiled LLVM objects
category/vm/llvm/emitter.hpp	Refactored to use dependency blocks IR instead of virtual stack
category/vm/llvm/llvm_state.hpp	Added helper methods for disk loading and updated GEP signature
test/vm/vm/test_vm.cpp	Removed manual `init_llvm()` calls as initialization is now automatic
category/vm/llvm/virtual_stack.hpp	Removed file as it's replaced by dependency blocks

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-11-04T15:46:21Z

category/vm/llvm/llvm.cpp

+                evmc_revision const rev =
+                    static_cast<evmc_revision>(std::stoi(exts[0].substr(1)));
+                uint256_t const hash256 =
+                    runtime::uint256_t::from_string("0x" + exts[1].substr(1));
+
+                evmc::bytes32 const code_hash = bytes32_from_uint256(hash256);
+
+                std::shared_ptr<LLVMState> const ptr =
+                    monad::vm::llvm::load_from_disk(rev, entry.path().string());
+
+                cached_llvm_code_[rev].insert({code_hash, ptr});


The code assumes exts[0] starts with a specific character and calls substr(1) without validation. If the extension doesn't follow the expected format (e.g., it's empty or doesn't start with the expected character), this will cause undefined behavior or exceptions. Add validation before calling substr(1) to ensure the string has the expected format.

Suggested change

evmc_revision const rev =

static_cast<evmc_revision>(std::stoi(exts[0].substr(1)));

uint256_t const hash256 =

runtime::uint256_t::from_string("0x" + exts[1].substr(1));

evmc::bytes32 const code_hash = bytes32_from_uint256(hash256);

std::shared_ptr<LLVMState> const ptr =

monad::vm::llvm::load_from_disk(rev, entry.path().string());

cached_llvm_code_[rev].insert({code_hash, ptr});

// Validate exts[0] and exts[1] before calling substr(1)

if (!exts[0].empty() && exts[0][0] == '.' &&

!exts[1].empty() && exts[1][0] == '.') {

evmc_revision const rev =

static_cast<evmc_revision>(std::stoi(exts[0].substr(1)));

uint256_t const hash256 =

runtime::uint256_t::from_string("0x" + exts[1].substr(1));

evmc::bytes32 const code_hash = bytes32_from_uint256(hash256);

std::shared_ptr<LLVMState> const ptr =

monad::vm::llvm::load_from_disk(rev, entry.path().string());

cached_llvm_code_[rev].insert({code_hash, ptr});

}

Copilot · 2025-11-04T15:46:21Z

category/vm/llvm/llvm.cpp

+                evmc_revision const rev =
+                    static_cast<evmc_revision>(std::stoi(exts[0].substr(1)));
+                uint256_t const hash256 =
+                    runtime::uint256_t::from_string("0x" + exts[1].substr(1));


Similar to the previous issue, exts[1].substr(1) is called without validating that exts[1] has at least 2 characters. This can cause undefined behavior if the extension is empty or has only one character. Add validation before calling substr(1).

Copilot · 2025-11-04T15:46:22Z

category/vm/llvm/dependency_blocks.hpp

+        InstrIdx unspill_offset_of(StackIdx i)
+        {
+            return (static_cast<InstrIdx>(-i - 1));


The formula -i - 1 for calculating unspill offset is not self-documenting and could be confusing. Consider adding a comment explaining the relationship between stack index and unspill instruction index, or using more descriptive intermediate variables to make the calculation clearer.

Suggested change

InstrIdx unspill_offset_of(StackIdx i)

{

return (static_cast<InstrIdx>(-i - 1));

// Converts a stack index to the corresponding unspill instruction index.

// The formula (-i - 1) is used because unspill instructions are indexed

// in reverse order relative to stack indices: stack index 0 maps to the

// last unspill instruction (-1), stack index 1 to (-2), etc.

InstrIdx unspill_offset_of(StackIdx i)

{

InstrIdx unspill_instr_idx = static_cast<InstrIdx>(-i - 1);

return unspill_instr_idx;

Copilot · 2025-11-04T15:46:22Z

category/vm/llvm/emitter.hpp

+                // you can eliminate this subtraction by rewriting the jumptable
+                // address


The comment on line 547-548 suggests an optimization but doesn't explain how it would work or when it should be implemented. Either provide more detailed documentation about the suggested optimization or remove the comment if it's not actionable.

Suggested change

// you can eliminate this subtraction by rewriting the jumptable

// address

Copilot · 2025-11-04T15:46:23Z

category/vm/llvm/llvm_state.hpp

-                    verifyModule(*llvm_module)); // BAL: this fails when O2 and
-                                                 // O3 are applied.  Why?
+                    verifyModule(*llvm_module)); // this fails when O2 and O3
+                                                 // are applied.  Why?


The comment indicates a known issue where module verification fails with O2/O3 optimization levels, but there's no explanation or tracking reference. This suggests potential correctness issues that should be investigated. Consider adding a TODO with context or a reference to a tracking issue.

Suggested change

// are applied. Why?

// are applied.

// TODO: Investigate why verifyModule fails when O2/O3 optimizations are applied.

// This may indicate a correctness issue in the optimization pipeline or module construction.

// Track this issue (create a GitHub issue or internal ticket) and link it here.

Copilot · 2025-11-04T15:46:23Z

category/vm/llvm/dependency_blocks.hpp

+inline bool is_pure(OpCode op)
+{
+    return (
+        op == Shl || op == Shr || op == Sar || op == Add || op == Mul ||
+        op == Sub || op == Div || op == SDiv || op == Mod || op == SMod ||
+        op == AddMod || op == MulMod || op == SignExtend || op == Lt ||
+        op == Gt || op == SLt || op == SGt || op == Eq || op == IsZero ||
+        op == And || op == Or || op == XOr || op == Not || op == Byte);
+
+    // should just put true or false in the OpCodeInfo table in opcodes.hpp
+}


The comment on line 41 suggests this logic should be moved to the OpCodeInfo table. This would be a better design as it centralizes opcode metadata and avoids maintaining duplicate lists. Consider implementing this suggestion or creating a tracking issue for it.

Copilot · 2025-11-04T15:46:23Z

category/vm/llvm/llvm_state.hpp

+                if (!std::filesystem::exists(nm_obj)) {
+                    ObjectTransformLayer &otl = lljit->getObjTransformLayer();
+                    DumpObjects const dumpobjs = DumpObjects("", nm);
+                    otl.setTransform(dumpobjs);
+                }


[nitpick] The logic conditionally sets up object dumping only if the file doesn't exist. However, this creates an implicit side effect where the behavior changes based on file system state. Consider making this behavior more explicit or documenting why the check is necessary.

brett-monad added 2 commits November 4, 2025 09:04

using dependency ir on llvm target

93716f9

original - compatible with clang-format-19

383bfe9

Copilot AI review requested due to automatic review settings November 4, 2025 15:43

brett-monad requested review from Baltoli and andreaslyn as code owners November 4, 2025 15:43

Copilot AI reviewed Nov 4, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Bletner/llvm include sort bug #1874

Bletner/llvm include sort bug #1874

Uh oh!

brett-monad commented Nov 4, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Nov 4, 2025

Uh oh!

Copilot AI Nov 4, 2025

Uh oh!

Copilot AI Nov 4, 2025

Uh oh!

Copilot AI Nov 4, 2025

Uh oh!

Copilot AI Nov 4, 2025

Uh oh!

Copilot AI Nov 4, 2025

Uh oh!

Copilot AI Nov 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-        InstrIdx unspill_offset_of(StackIdx i)
-        {
-            return (static_cast<InstrIdx>(-i - 1));
+        // Converts a stack index to the corresponding unspill instruction index.
+        // The formula (-i - 1) is used because unspill instructions are indexed
+        // in reverse order relative to stack indices: stack index 0 maps to the
+        // last unspill instruction (-1), stack index 1 to (-2), etc.
+        InstrIdx unspill_offset_of(StackIdx i)
+        {
+            InstrIdx unspill_instr_idx = static_cast<InstrIdx>(-i - 1);
+            return unspill_instr_idx;

		// you can eliminate this subtraction by rewriting the jumptable
		// address

-                                                 // are applied.  Why?
+                                                 // are applied.
+                    // TODO: Investigate why verifyModule fails when O2/O3 optimizations are applied.
+                    // This may indicate a correctness issue in the optimization pipeline or module construction.
+                    // Track this issue (create a GitHub issue or internal ticket) and link it here.

Bletner/llvm include sort bug #1874

Are you sure you want to change the base?

Bletner/llvm include sort bug #1874

Uh oh!

Conversation

brett-monad commented Nov 4, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 4, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants