Skip to content

Commit

Permalink
make the non-determinism story more explicit
Browse files Browse the repository at this point in the history
  • Loading branch information
RalfJung committed Aug 5, 2022
1 parent 09be122 commit 36c2666
Show file tree
Hide file tree
Showing 7 changed files with 57 additions and 41 deletions.
24 changes: 18 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,23 +27,35 @@ imagine Rust without all the restrictions about sizendess and pointer indirectio
We use generic type names like `List`, `Map`, `Set` rather than concrete implementations like `Vec`, `HashMap`, `HashSet`, since the implementation details do not matter.
Also, all types except for mutable references are `Copy` (let's just imagine we implicitly `Clone` where needed), and we use `fn(T) -> U` notation even for closures that can capture arbitrarily.
We also assume some "obvious" language extensions -- basically, it should always be clear what is meant to anyone with some Rust experience, even if this is not actually legal Rust.
We use `panic!` (and `unwrap` and slice indexing similar standard Rust operations) to indicate conditions that should always hold; if execution ever panics, that is a bug in the specification.

Our functions are generally pure; they can write to mutable reference but we can consider this to be implemented via explicit state passing.
When we do need other effects, we make them explicit in the return type.
The next sections describe the effects used in MiniRust.

### Fallible operations

We use `Result` to make operations fallible (where failure indicates UB or machine termination).
The bodies of `Result` and `Option`-returning functions behave like `try` blocks (implicit `Ok`/`Some` wrapping, and `throw!()` to return an error value (`Err`/`None`), like the experimental `yeet`).
We use a `throw_ub!` macro to make the current function return a UB error value, and `throw_machine_stop!` to indicate that and how the machine has stopped.
See [the prelude](prelude.md) for details.

We use `panic!` (and `unwrap` and slice indexing similar standard Rust operations) to indicate conditions that should always hold; if execution ever panics, that is a bug in the specification.
### Non-determinism

We also need one language feature that Rust does not have direct support for: non-determinism.
The return type `Nondet<T>` indicates that this function picks the returned `T` (and also its effect on `&mut` it has access to) *non-deterministically*
This is only used in the memory model; the expression language is perfectly deterministic (but of course it has to propagate the memory model non-determinism).
In a function returning `Nondet`, we use `?` for monadic bind (this is more general than its usual role for short-circuiting), and the return value is implicitly wrapped in monadic return.

We also need one language feature that Rust does not have direct support for: functions returning `Result` can exhibit non-determinism.
(If you are a monad kind of person, think of `Result` as also containing the non-determinism monad, not just the error monad.)
This is only used in the memory model; the expression language is perfectly deterministic.
The function `pick<T>(fn(T) -> bool) -> Result<T>` will return a value of type `T` such that the given closure returns `true` for this value.
The function `pick<T>(fn(T) -> bool) -> Nondet<T>` will return a value of type `T` such that the given closure returns `true` for this value.
This non-determinism is interpreted *daemonically*, which means that the program has to be correct for every possible choice.
In particular, if the closure is `|_| false` or `T` is uninhabited, then this corresponds to "no behavior" (which is basically the perfect opposite of Undefined Behavior, and also very confusing).
Similarly, the function `predict<T>(fn(T) -> bool) -> Result<T>` also returns a `T` satisfying the closure, but this non-determinism is interpreted *angelically*, which means there has to *exist* a possible choice that makes the program behave as intended.
Similarly, the function `predict<T>(fn(T) -> bool) -> Nondet<T>` also returns a `T` satisfying the closure, but this non-determinism is interpreted *angelically*, which means there has to *exist* a possible choice that makes the program behave as intended.
In particular, if the closure is `|_| false` or `T` is uninhabited, then this operation is exactly the same as `hint::unreachable_unchecked()`: no possible choice exists, and hence ever reaching this operation is Undefined Behavior.

The combined monad `Nondet<Result<T>>` is abbreviated `NdResult<T>`, and in such a function `?` can also be used on computations that only need one of the monads, applying suitable lifting:
`Result<U> -> NdResult<U>` is trivial (just use monadic return of `Nondet`); `Nondet<U>` -> `NdResult<U>` simply maps `Ok` over the inner computation.

## Status

MiniRust is extremely incomplete!
Expand Down
20 changes: 10 additions & 10 deletions lang/operator.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,22 +6,22 @@ Here we define the part of the [`step` function](step.md) that is concerned with

```rust
impl Machine {
fn eval_un_op(&mut self, operator: UnOp, operand: Value) -> Result<Value>;
fn eval_un_op(&mut self, operator: UnOp, operand: Value) -> NdResult<Value>;
}
```

### Integer operations

```rust
impl Machine {
fn eval_un_op_int(&mut self, op: UnOpInt, operand: BigInt) -> Result<BigInt> {
fn eval_un_op_int(&mut self, op: UnOpInt, operand: BigInt) -> NdResult<BigInt> {
use UnOpInt::*;
match op {
Neg => -operand,
Cast => operand,
}
}
fn eval_un_op(&mut self, Int(op, int_type): UnOp, operand: Value) -> Result<Value> {
fn eval_un_op(&mut self, Int(op, int_type): UnOp, operand: Value) -> NdResult<Value> {
let Value::Int(operand) = operand else { panic!("non-integer input to integer operation") };

// Perform the operation.
Expand All @@ -37,12 +37,12 @@ impl Machine {

```rust
impl Machine {
fn eval_un_op(&mut self, Ptr2Int: UnOp, operand: Value) -> Result<Value> {
fn eval_un_op(&mut self, Ptr2Int: UnOp, operand: Value) -> NdResult<Value> {
let Value::Ptr(ptr) = operand else { panic!("non-pointer input to ptr2int cast") };
let result = self.intptrcast.ptr2int(ptr)?;
Value::Int(result)
}
fn eval_un_op(&mut self, Int2Ptr: UnOp, operand: Value) -> Result<Value> {
fn eval_un_op(&mut self, Int2Ptr: UnOp, operand: Value) -> NdResult<Value> {
let Value::Int(addr) = operand else { panic!("non-integer input to int2ptr cast") };
let result = self.intptrcast.int2ptr(addr)?;
Value::Ptr(result)
Expand All @@ -54,15 +54,15 @@ impl Machine {

```rust
impl Machine {
fn eval_bin_op(&mut self, operator: BinOp, left: Value, right: Value) -> Result<Value>;
fn eval_bin_op(&mut self, operator: BinOp, left: Value, right: Value) -> NdResult<Value>;
}
```

### Integer operations

```rust
impl Machine {
fn eval_bin_op_int(&mut self, op: BinOpInt, left: BigInt, right: BigInt) -> Result<BigInt> {
fn eval_bin_op_int(&mut self, op: BinOpInt, left: BigInt, right: BigInt) -> NdResult<BigInt> {
use BinOpInt::*;
match op {
Add => left + right,
Expand All @@ -76,7 +76,7 @@ impl Machine {
}
}
}
fn eval_bin_op(&mut self, Int(op, int_type): BinOp, left: Value, right: Value) -> Result<Value> {
fn eval_bin_op(&mut self, Int(op, int_type): BinOp, left: Value, right: Value) -> NdResult<Value> {
let Value::Int(left) = left else { panic!("non-integer input to integer operation") };
let Value::Int(right) = right else { panic!("non-integer input to integer operation") };

Expand All @@ -103,7 +103,7 @@ impl Machine {

/// Perform in-bounds arithmetic on the given pointer. This must not wrap,
/// and the offset must stay in bounds of a single allocation.
fn ptr_offset_inbounds(&self, ptr: Pointer, offset: BigInt) -> Result<Pointer> {
fn ptr_offset_inbounds(&self, ptr: Pointer, offset: BigInt) -> NdResult<Pointer> {
if !offset.in_bounds(Signed, PTR_SIZE) {
throw_ub!("inbounds offset does not fit into `isize`"):
}
Expand All @@ -129,7 +129,7 @@ impl Machine {
new_ptr
}

fn eval_bin_op(&mut self, PtrOffset { inbounds }: BinOp, left: Value, right: Value) -> Result<Value> {
fn eval_bin_op(&mut self, PtrOffset { inbounds }: BinOp, left: Value, right: Value) -> NdResult<Value> {
let Value::Ptr(left) = left else { panic!("non-pointer left input to pointer addition") };
let Value::Int(right) = right else { panic!("non-integer right input to pointer addition") };

Expand Down
44 changes: 22 additions & 22 deletions lang/step.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ For statements it also advances the program counter.
```rust
impl Machine {
/// To run a MiniRust program, call this in a loop until it throws an `Err` (UB or termination).
fn step(&mut self) -> Result {
fn step(&mut self) -> NdResult {
let frame = self.cur_frame_mut();
let (next_block, next_stmt) = &mut frame.next;
let block = &frame.func.blocks[next_block];
Expand All @@ -40,7 +40,7 @@ This section defines the following function:

```rust
impl Machine {
fn eval_value(&mut self, val: ValueExpr) -> Result<Value>;
fn eval_value(&mut self, val: ValueExpr) -> NdResult<Value>;
}
```

Expand All @@ -50,7 +50,7 @@ Constants are trivial, as one would hope.

```rust
impl Machine {
fn eval_value(&mut self, Constant(value, _type): ValueExpr) -> Result<Value> {
fn eval_value(&mut self, Constant(value, _type): ValueExpr) -> NdResult<Value> {
value
}
}
Expand All @@ -62,7 +62,7 @@ This loads a value from a place (often called "place-to-value coercion").

```rust
impl Machine {
fn eval_value(&mut self, Load { destructive, source }: ValueExpr) -> Result<Value> {
fn eval_value(&mut self, Load { destructive, source }: ValueExpr) -> NdResult<Value> {
let p = self.eval_place(source)?;
let ptype = source.check(self.cur_frame().func.locals).unwrap();
let v = self.mem.typed_load(p, ptype)?;
Expand All @@ -81,12 +81,12 @@ The `&` operators simply converts a place to the pointer it denotes.

```rust
impl Machine {
fn eval_value(&mut self, AddrOf { target }: ValueExpr) -> Result<Value> {
fn eval_value(&mut self, AddrOf { target }: ValueExpr) -> NdResult<Value> {
let p = self.eval_place(target)?;
Value::Ptr(p)
}

fn eval_value(&mut self, Ref { target, align, .. }: ValueExpr) -> Result<Value> {
fn eval_value(&mut self, Ref { target, align, .. }: ValueExpr) -> NdResult<Value> {
let p = self.eval_place(target)?;
let ptype = target.check(self.cur_frame().func.locals).unwrap();
// We need a check here, to ensure that encoding this value at the given type is valid.
Expand All @@ -106,11 +106,11 @@ The functions `eval_un_op` and `eval_bin_op` are defined in [a separate file](op

```rust
impl Machine {
fn eval_value(&mut self, UnOp { operator, operand }: ValueExpr) -> Result<Value> {
fn eval_value(&mut self, UnOp { operator, operand }: ValueExpr) -> NdResult<Value> {
let operand = self.eval_value(operand)?;
self.eval_un_op(operator, operand)?
}
fn eval_value(&mut self, BinOp { operator, left, right }: ValueExpr) -> Result<Value> {
fn eval_value(&mut self, BinOp { operator, left, right }: ValueExpr) -> NdResult<Value> {
let left = self.eval_value(left)?;
let right = self.eval_value(right)?;
self.eval_bin_op(operator, left, right)?
Expand All @@ -128,7 +128,7 @@ Place evaluation ensures that this pointer is always dereferenceable (for the ty
type Place = Pointer;

impl Machine {
fn eval_place(&mut self, place: PlaceExpr) -> Result<Place>;
fn eval_place(&mut self, place: PlaceExpr) -> NdResult<Place>;
}
```

Expand All @@ -138,7 +138,7 @@ The place for a local is directly given by the stack frame.

```rust
impl Machine {
fn eval_place(&mut self, Local(name): PlaceExpr) -> Result<Place> {
fn eval_place(&mut self, Local(name): PlaceExpr) -> NdResult<Place> {
// This implicitly asserts that the local is live!
self.cur_frame().locals[name]
}
Expand All @@ -156,7 +156,7 @@ It also ensures that the pointer is dereferenceable.

```rust
impl Machine {
fn eval_place(&mut self, expr @ Deref { operand, .. }: PlaceExpr) -> Result<Place> {
fn eval_place(&mut self, expr @ Deref { operand, .. }: PlaceExpr) -> NdResult<Place> {
let Value::Ptr(p) = self.eval_value(operand)? else {
panic!("dereferencing a non-pointer")
};
Expand All @@ -169,7 +169,7 @@ impl Machine {

```rust
impl Machine {
fn eval_place(&mut self, Field { root, field }: PlaceExpr) -> Result<Place> {
fn eval_place(&mut self, Field { root, field }: PlaceExpr) -> NdResult<Place> {
let type = root.check(self.cur_frame().func.locals).unwrap().type;
let root = self.eval_place(root)?;
let offset = match type {
Expand All @@ -181,7 +181,7 @@ impl Machine {
self.ptr_offset_inbounds(root, offset.bytes())?
}

fn eval_place(&mut self, Index { root, index }: PlaceExpr) -> Result<Place> {
fn eval_place(&mut self, Index { root, index }: PlaceExpr) -> NdResult<Place> {
let type = root.check(self.cur_frame().func.locals).unwrap().type;
let root = self.eval_place(root)?;
let Value::Int(index) = self.eval_value(index)? else {
Expand Down Expand Up @@ -227,7 +227,7 @@ Assignment evaluates its two operands, and then stores the value into the destin

```rust
impl Machine {
fn eval_statement(&mut self, Assign { destination, source }: Statement) -> Result {
fn eval_statement(&mut self, Assign { destination, source }: Statement) -> NdResult {
let place = self.eval_place(destination)?;
let val = self.eval_value(source)?;
let ptype = place.check(self.cur_frame().func.locals).unwrap();
Expand All @@ -248,7 +248,7 @@ This is equivalent to the assignment `_ = place`.

```rust
impl Machine {
fn eval_statement(&mut self, Finalize { place }: Statement) -> Result {
fn eval_statement(&mut self, Finalize { place }: Statement) -> NdResult {
let p = self.eval_place(place)?;
let ptype = place.check(self.cur_frame().func.locals).unwrap();
let _val = self.mem.typed_load(p, ptype)?;
Expand All @@ -262,14 +262,14 @@ These operations (de)allocate the memory backing a local.

```rust
impl Machine {
fn eval_statement(&mut self, StorageLive(local): Statement) -> Result {
fn eval_statement(&mut self, StorageLive(local): Statement) -> NdResult {
// Here we make it a spec bug to ever mark an already live local as live.
let layout = self.cur_frame().func.locals[local].layout();
let p = self.mem.allocate(layout.size, layout.align)?;
self.cur_frame_mut().locals.try_insert(local, p).unwrap();
}

fn eval_statement(&mut self, StorageDead(local): Statement) -> Result {
fn eval_statement(&mut self, StorageDead(local): Statement) -> NdResult {
// Here we make it a spec bug to ever mark an already dead local as dead.
let layout = self.cur_frame().func.locals[local].layout();
let p = self.cur_frame_mut().locals.remove(local).unwrap();
Expand All @@ -292,7 +292,7 @@ The simplest terminator: jump to the (beginning of the) given block.

```rust
impl Machine {
fn eval_terminator(&mut self, Goto(block_name): Terminator) -> Result {
fn eval_terminator(&mut self, Goto(block_name): Terminator) -> NdResult {
self.cur_frame_mut().next = (block_name, 0);
}
}
Expand All @@ -302,7 +302,7 @@ impl Machine {

```rust
impl Machine {
fn eval_terminator(&mut self, If { condition, then_block, else_block }: Terminator) -> Result {
fn eval_terminator(&mut self, If { condition, then_block, else_block }: Terminator) -> NdResult {
let Value::Bool(b) = self.eval_value(condition)? else {
panic!("if on a non-boolean")
};
Expand All @@ -316,7 +316,7 @@ impl Machine {

```rust
impl Machine {
fn eval_terminator(&mut self, Unreachable: Terminator) -> Result {
fn eval_terminator(&mut self, Unreachable: Terminator) -> NdResult {
throw_ub!("reached unreachable code");
}
}
Expand All @@ -334,7 +334,7 @@ impl Machine {
fn eval_terminator(
&mut self,
Call { callee, arguments, ret, next_block }: Terminator
) -> Result {
) -> NdResult {
let Some(func) = self.prog.functions.get(callee) else {
throw_ub!("calling non-existing function");
};
Expand Down Expand Up @@ -400,7 +400,7 @@ The callee should probably start with a bunch of `Finalize` statements to ensure

```rust
impl Machine {
fn eval_terminator(&mut self, Return: Terminator) -> Result {
fn eval_terminator(&mut self, Return: Terminator) -> NdResult {
let frame = self.stack.pop().unwrap();
let func = frame.func;
// Copy return value to where the caller wants it.
Expand Down
2 changes: 1 addition & 1 deletion mem/basic.md
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@ Then we implement creating and removing allocations.

```rust
impl MemoryInterface for BasicMemory {
fn allocate(&mut self, size: Size, align: Align) -> Result<Pointer<AllocId>> {
fn allocate(&mut self, size: Size, align: Align) -> NdResult<Pointer<AllocId>> {
// Reject too large allocations. Size must fit in `isize`.
if !size.in_bounds(Signed, PTR_SIZE) {
throw_ub!("asking for a too large allocation");
Expand Down
2 changes: 1 addition & 1 deletion mem/interface.md
Original file line number Diff line number Diff line change
Expand Up @@ -93,7 +93,7 @@ trait MemoryInterface {

/// Create a new allocation.
/// The initial contents of the allocation are `AbstractByte::Uninit`.
fn allocate(&mut self, size: Size, align: Align) -> Result<Self::Pointer>;
fn allocate(&mut self, size: Size, align: Align) -> NdResult<Self::Pointer>;

/// Remove an allocation.
fn deallocate(&mut self, ptr: Self::Pointer, size: Size, align: Align) -> Result;
Expand Down
2 changes: 1 addition & 1 deletion mem/intptrcast.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ impl IntPtrCast<Provenance> {
ptr.addr
}

fn int2ptr(&mut self, addr: BigInt) -> Result<Pointer<Provenance>> {
fn int2ptr(&mut self, addr: BigInt) -> NdResult<Pointer<Provenance>> {
// Predict a suitable provenance. It must be either `None` or already exposed.
let provenance = predict(|prov: Option<Provenance>| {
prov.map_or(
Expand Down
4 changes: 4 additions & 0 deletions prelude.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,10 @@ macro_rules! throw_machine_stop {
($($tt:tt)*) => { do yeet TerminationInfo::MachineStop(format!($($tt)*)) };
}

/// We leave the encoding of the non-determinism monad opaque.
type Nondet<T=()>;
type NdResult<T=()> = Nondet<Result<T>>;

/// Basically copies of the `Size` and `Align` types in the Rust compiler.
/// See <https://doc.rust-lang.org/nightly/nightly-rustc/rustc_target/abi/struct.Size.html>
/// and <https://doc.rust-lang.org/nightly/nightly-rustc/rustc_target/abi/struct.Align.html>.
Expand Down

0 comments on commit 36c2666

Please sign in to comment.