Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 12 additions & 8 deletions src/singlepass/x64/codegen.h
Original file line number Diff line number Diff line change
Expand Up @@ -935,19 +935,15 @@ class X64OnePassCodeGenImpl
X64Reg::getRegRef<X64::I32>(BaseReg), 0,
Offset, getWASMTypeSize<SrcType>());

#ifdef ZEN_ENABLE_CPU_EXCEPTION
if (!Base.isImm() && (Offset >= INT32_MAX)) {
// when offset >= INT32_MAX, then will cause inst like mov edi, dword
// ptr[r13+edi-1].
if (!Base.isImm() && (Offset >= (uint32_t)INT32_MAX)) {
auto MemAddrReg = Layout.getScopedTemp<AddrType, ScopedTempReg2>();
_ mov(X64Reg::getRegRef<X64::I32>(MemAddrReg), Offset);
_ add(X64Reg::getRegRef<X64::I64>(MemAddrReg),
X64Reg::getRegRef<X64::I64>(BaseReg));
Comment on lines +938 to 942
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Offset >= (uint32_t)INT32_MAX threshold is slightly off for the disp32 sign-extension issue: sign-extension becomes negative starting at 0x80000000 (i.e., Offset > INT32_MAX). Using >= INT32_MAX will unnecessarily take the slower explicit-address path for Offset == 0x7fffffff, which is still representable as a positive disp32.

Copilot uses AI. Check for mistakes.
_ add(X64Reg::getRegRef<X64::I64>(MemAddrReg), ABI.getMemoryBaseReg());
Addr = asmjit::x86::Mem(X64Reg::getRegRef<X64::I64>(MemAddrReg), 0,
getWASMTypeSize(SrcType));
getWASMTypeSize<SrcType>());
}
Comment on lines +938 to 946
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The large-offset handling is gated on !Base.isImm(), but the disp32 sign-extension problem also affects the Base.isImm() addressing form (e.g., i32.const 0; *.load/store offset>=0x80000000). Additionally, the current Base.isImm() path clamps Offset to INT32_MAX, which would miscompile valid in-bounds accesses above 2GiB. Consider extending the 64-bit explicit-address computation to cover immediate bases as well (materialize the full effective address in a GP register and use [reg]), and remove the clamping-as-invalid approach.

Copilot uses AI. Check for mistakes.
#endif // ZEN_ENABLE_CPU_EXCEPTION

LoadOperatorImpl<X64DestType, X64SrcType, Sext>::emit(
ASM, X64Reg::getRegRef<X64DestType>(ValReg), Addr);
Expand Down Expand Up @@ -1014,15 +1010,23 @@ class X64OnePassCodeGenImpl
ZEN_ABORT();
}

// Addr = memoryBase + (in64) offset, so when offset < 0,
// the result i32 Addr works like add (2**32 + offset)
asmjit::x86::Mem Addr =
Base.isImm() ? asmjit::x86::Mem(ABI.getMemoryBaseReg(), Offset,
getWASMTypeSize<Type>())
: asmjit::x86::Mem(ABI.getMemoryBaseReg(),
X64Reg::getRegRef<X64::I32>(RegNum), 0,
Offset, getWASMTypeSize<Type>());

if (!Base.isImm() && (Offset >= (uint32_t)INT32_MAX)) {
auto MemAddrReg = Layout.getScopedTemp<AddrType, ScopedTempReg2>();
_ mov(X64Reg::getRegRef<X64::I32>(MemAddrReg), Offset);
_ add(X64Reg::getRegRef<X64::I64>(MemAddrReg),
X64Reg::getRegRef<X64::I64>(RegNum));
Comment on lines +1020 to +1024
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: Offset >= (uint32_t)INT32_MAX triggers the explicit-address slow path one value too early. The disp32 sign bit flips at 0x80000000 (Offset > INT32_MAX), while 0x7fffffff is still a safe positive disp32.

Copilot uses AI. Check for mistakes.
_ add(X64Reg::getRegRef<X64::I64>(MemAddrReg), ABI.getMemoryBaseReg());
Addr = asmjit::x86::Mem(X64Reg::getRegRef<X64::I64>(MemAddrReg), 0,
getWASMTypeSize<Type>());
}
Comment on lines +1020 to +1028
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the load path, this large-offset fix only runs when !Base.isImm(). Immediate-base memory ops with offset>=0x80000000 still rely on a disp32 encoding (or the existing INT32_MAX clamping), which can misaddress memory or incorrectly trap on valid accesses. It would be safer to apply the same explicit 64-bit address materialization for the Base.isImm() case too.

Copilot uses AI. Check for mistakes.

mov<X64Type, ScopedTempReg0>(Addr, Value);
}

Expand Down
44 changes: 44 additions & 0 deletions tests/wast/spec_extra/store_large_offset_false_positive_oob.wast
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
;; Test: f64.store with offset >= INT32_MAX on large memory should NOT trap.
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The opening comment says "offset >= INT32_MAX" but the disp32 sign-extension hazard starts at 0x80000000 (i.e., INT32_MAX + 1). Consider updating the wording to avoid implying that 0x7fffffff is problematic.

Suggested change
;; Test: f64.store with offset >= INT32_MAX on large memory should NOT trap.
;; Test: f64.store with offset >= 0x80000000 (INT32_MAX + 1) on large memory should NOT trap.

Copilot uses AI. Check for mistakes.
;;
;; When offset >= 0x80000000, x86-64 disp32 sign-extends to a negative 64-bit
;; value, causing the effective address to go before MemBase. The JIT must
;; compute the full 64-bit address explicitly to avoid this.
;;
;; Effective address = memory.size(65131) + offset(4268353288) = 4268418419
;; Memory size = 65131 * 65536 = 4268425216
;; 4268418419 + 8 <= 4268425216 => in-bounds, should NOT trap.

(module
(memory (;0;) 65131)

;; dynamic base (memory.size result) + large offset: in-bounds store
(func (export "f64_store_dynamic_base_large_offset")
memory.size
f64.const -5.44203
f64.store offset=4268353288)

;; dynamic base via parameter + large offset: in-bounds store
(func (export "f64_store_param_base_large_offset") (param i32)
local.get 0
f64.const 1.0
f64.store offset=4268353288)
Comment on lines +14 to +24
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test exercises large offsets with dynamic bases; to fully cover the disp32 sign-extension regression, consider also adding a case where the base is an immediate constant (e.g., i32.const 0) with an offset>=0x80000000. That hits the Base.isImm() addressing form, which can behave differently in the JIT.

Copilot uses AI. Check for mistakes.

;; dynamic base (memory.size result) + large offset: in-bounds load
(func (export "f64_load_dynamic_base_large_offset") (result f64)
memory.size
f64.load offset=4268353288)

;; large offset store that IS out-of-bounds must still trap
(func (export "f64_store_large_offset_oob") (param i32)
local.get 0
f64.const 0.0
f64.store offset=4294967295)
)

;; These should all succeed without trapping
(assert_return (invoke "f64_store_dynamic_base_large_offset"))
(assert_return (invoke "f64_store_param_base_large_offset" (i32.const 65131)))
(assert_return (invoke "f64_load_dynamic_base_large_offset") (f64.const 1.0))

;; Large offset that actually exceeds memory should still trap
(assert_trap (invoke "f64_store_large_offset_oob" (i32.const 65131)) "out of bounds memory access")
Loading