Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking PR for v0.4 release #343

Draft
wants to merge 93 commits into
base: main
Choose a base branch
from
Draft

Tracking PR for v0.4 release #343

wants to merge 93 commits into from

Conversation

bitwalker and others added 30 commits July 23, 2023 03:43
This fixes an issue which caused expansion of a folded comprehension
within the context of a constraint to be improperly treated as a
constraint comprehension. This expansion occurs during the inlining
phase.

This bug occurred because a flag determines whether a given
comprehension is a constraint comprehension or a regular list
comprehension, and is set to true when expanding a constraint.
However, while expanding a constraint, we may encounter a call to a
function, such as one of the list folding builtins, in which case
a comprehension passed as an argument to those builtins is not
expressing a constraint, so the flag should be false while expanding
it.

We already manage this flag correctly in other contexts, but this one
was missed during development.

Fixes #340
This commit extends the syntax and semantics for functions in AirScript:

* You can now define functions that take and produce values, using the
  `fn` keyword, see below for an example.
* You can now call value-producing functions in expresssion position,
  i.e. as operands to a binary operator, as an iterable for use by
  comprehensions, etc.
* Functions are strongly typed, and are type checked to ensure that both
  the function itself, and all call sites, are correctly typed.

Example:

```
fn madd3(a: felt[3], b: felt) -> felt {
    let d = [c * b for c in a]
    return sum(d)
}
```

In this example:

* Two parameters are bound, `a` and `b`, a vector of 3 felts, and a
  felt, respectively
* The return type is declared to be felt
* The body of the function can be arbitrarily complex, i.e. you can
  define variables, comprehensions, etc.
* Not illustrated here, but all of the usual global bindings (i.e. trace
  columns, public inputs, random values, etc.) are in scope and can be
  referenced.

Things you cannot do with functions:

* A function _must_ return a value, i.e. it cannot have an empty body
* A function _may not_ contain constraints
* You may call functions within a function, but recursion is not
  supported, i.e. an error will be raised if a call is made to a
  function which is already on the call stack
This commit does two things:

1. Modifies the AST to allow `let` in expression position (scalar or
   otherwise).
2. Refactors the constant propagation and inlining passes to properly
   handle occurrances of `let` in expression position, and make use of
   this new capability to simplify inlining of certain syntax nodes.

In particular, the inliner now makes liberal use of this new flexibility
in the AST, in order to expand syntax nodes in expression position. Such
nodes, with the introduction of functions, can have arbitrarily complex
expansions, and with this change those expansions can now be done in-place,
rather than attempting to "lift" expressions that may produce block-like
expansions into the nearest containing statement block, requiring expensive
let-tree rewrites.

In fact, it became clear during the testing and implementation of
functions that without the ability to expand the syntax tree in this
manner, it would be virtually impossible to correctly inline a full
AirScript program. For example, consider the following:

```
trace_columns {
    main: [clk, a, b[4]]
}

fn fold_vec(a: felt[4]) -> felt {
    let m = a[0] * a[1]
    let n = m * a[2]
    let o = n * a[3]
    return o
}

integrity_constraints {
    let o = a
    let m = fold_vec(b)
    enf o = m
}
```

After inlining (the old way), we would get (commentary inline):

```
integrity_constraints {
    # This binding will be shadowed by the binding of the same name from
    # the inlined body of `fold_vec`
    let o = a

    # This `m` is the one inlined from `fold_vec`
    let m = b[0] * b[1]
    let n = m * b[2]
    # This `o` is the one inlined from `fold_vec`, and now shadows the `o`
    # bound in the original `integrity_constraints`
    let o = n * b[3]

    # The inliner moved the original binding of `m` into the innermost
    # `let` body, so that it can bind the value "returned" from
    # `fold_vec`, as expected. Because of this, it shadows the `m` that
    # was inlined from `fold_vec`, and no one is the wiser because the
    # semantics of the original code are preserved
    let m = o
    # This constraint is now incorrect, as the binding, `o`, we intended
    # to constrain has been shadowed by a different `o`.
    enf o = m
}
```

To summarize, the original inliner needed to split the current block at
the statement being expanded/inlined, and move code in two directions:
"lifting" inlined statements into the current block before the split,
and "lowering" the statements after the split by placing them in the
innermost `let` block. This was necessary so that the result of the
inlined function (or more generally, any expression with a block-like
expansion, e.g. comprehensions) could be bound to the name used in the
original source code, with all of the necessary bindings in scope so
that the expression that was bound would correctly evaluate during
codegen.

As we can see, the result of this is that an expanded/inlined syntax node
could introduce bindings that would shadow other bindings in scope, and
change the behavior of the resulting program (as demonstrated above).

This commit allows bindings to be introduced anywhere that an expression
is valid. This has the effect of no longer requiring code motion just to
support let-bound variables in an inlined/expanded expression. This
simplifies the inliner quite a bit.
This makes it possible to reference named constants in a range, or slice
indexing operation.

NOTE: The named constants are rewritten with the values they refer to
early during compilation, so in most of the pipeline you will never be
able to observe anything but constant range bounds. However, rather than
complicate the AST further by making that distinction explicit, we reuse
a single type for all ranges, which means we have a lot of
apparently-fallible operations in places where they are actually
infallible, but this seems like a decent compromise.
MassaLabs: add comma in match/case statement
…tation

MassaLabs: Allow named constants range notation
Translation from AST has been vastly commented out,
to rework
Soulthym and others added 30 commits November 14, 2024 10:15
implement Enf, Call, Fold, For, If, Vector, Matrix, Boundary, Parameter
added Eq support to operation
few bugfixes
rename NodeTypes to NodeType
split NodeType into LeafNode, RootNode, MiddleNode
removed debugging
new_i32 replaced by generic new_value
LeafNode::I32 renamed to Value
implemented Felt Leaf node
improved conversions
removed debugging
removed MiddleNode::Node
renamed new_body -> new_scope
implement Default
* swap-ir-impl starting point

* use absolute paths in IsNode derive macro

* binary_op: move Add, implement Mul and Sub

* fix IsNode derive macro module name

* fix name conflict with rust keywords in IsNode derive macro

* blocks: Function, Evaluator, If, For, Fold

* structured_op: moved Fold, Call

* Implement Visitor on new structure

* Add some fields to Graph to match previous API

* IsNode derive macro: support enums

* unary_ops: Boundary and Enf
IsNode derive macro: support extra fields

* IsLeaf derive macro for enums

* leaf nodes: SpannedMirVariable + Parameter

* aggregated_op: Vector and Matrix

* represent Matrix as Node of Vectors

* First steps on updating pretty printer

* Start introducing new node structure in Mir

* Update passes

* Implement Unrolling (needs Hash & Option/Vecs)

* Implement Lowering MIR > AIR

* Avoid referencing a func/ev that has not been defined

+ cargo fmt

* Update Unrolling and Inlining with helper

* Comment out Old passes

* Cleaned up a bit

* Add Access node usage to MIR

* bugfix

* Finished duplicate_node helper

* Replace Call with last expr of the body

* rename Access => Accessor

* Fix Accessor unrolling

* cargo fmt

* Small bugfixes in Unrolling for Accessor

* Update passes try ir3 (#5)

* derive Hash for SpannedMirValue

* ir3: better node splitting, no macro, simpler api
missimg builder pattern

* replaced Link<Op> by Link<Owner> for Operations and Leaves

* added IndexAccess Operand

* renamed IndexAccess to Accessor
re use parser's AccessType
impl Hash for AccessType

* fix call being copied instead of pointed to by get_children()

* typesafe builder pattern for most ops. Misssing structured ops

* type-safe Builder pattern for structured ops

* MirGraph Builder editing

* Add Leaves to Op enum

* ir3: comment top level items

* isolate Leaf, Op, Owner, Root

* add converters to enums

* added converters to structs
renamed IndexAccess to Acccessor

* Initial merge ir3 > update_passes

* remove comments

* Update mod.rs

* Graph: implement public api

* Swapped in helpers of passes/mod.rs

* Node: all node types + converters

* Full inlining with Node + cargo fmt

* Remove Link for FoldOperator

* Swap unrolling

* Swap lowering Mir -> Air

* Update visitor (with suboptimal get_children() herlper, to refactor)

* Add MirType to Parameter

* Update inlining to handle evaluators

* most of translated adapted

* add type support for Parameter in translate

* in build_statement, handle Owner::None (integrity or boundary constraint)

---------

Co-authored-by: Thybault Alabarbe <[email protected]>

* Cargo format and make Lowering Mir > Air build

* Clean both pipelines, removed unused structures

* removed: unused derive_graph, allow unused

* accept non-vectors as For.iterators

* Fix ir translate and passes (#7)

* fix translate.rs not translating bodies

* expand fn and ev args, missing scoping

* expand all arguments in function and evaluators

* unpack vec in all cases

* lookup arguments in the access_map if not found in bindings

* stop unpacking call arguments, add Vector<Params> to bindings

* insert accessors to bound Vector<Param>

* fix Value Builder types

* insert links in enums

* added missing Child/Parent traits

* Remove fn edit(self) on builders

* Make translate_from_mir.rs compile

* fix double borrow

* Handle multiple parents

* remove some warns

* Adapt Visitor and Inlining

* Update inlining and visitor

* Fix translate of functions

* visitor with pre-scan

* Add unrolling2 + fmt

* Add comments to passes

* Add comment to translate.rs pass

* Update Inlining and Unrolling passes

Stilll some debug todo

* Cargo fmt

* Add TODOs

* Builder derive macro

* fix Builder derive macro compilation

* Builder derive macro: fix Default recursion

* Builder derive macro: fix mutability on non-link, fix transistion
derive for Function

* Builder derive macro: handle Vec<BackLink> + impl Sub

* Builder derive macro: docs & fix + test >2 required fields case

* Builder derive macro: derive on all structs

* remived unused buggy api to remoce_child

* mutability examples and potention solutions:
one bug left

* double borrow bug remaining

* fix test_mutability_wrap_enum: missing clone

* Fix doouble borrow in Link::update

* fix singleton updates in final design

* cleanup intermediary versions

* fix module name

* Translate - Split function / ev parameter translation

* Fix for_node children, don't put None selector

* Fix inlining - Needs mutability

* Cleanup unrolling

* cargo fmt

* fix warns

* Improve comments

* Improve context setting in Unrolling

* refactor checkpoint: wrap structs in Op/Root enums

* Fix let translation

* Add prints to unrolling

* Add diagnostics to translate

* swap with ir + fix Builder enumwrapper
checkpoint: does not compile

* Setup diagnostics for inlining and unrolling

* converer bug: return from local reference

* avoid double option in BackLink::to_link()

* wrap all structs in enum:
checkpoint: compiles, 81 tests pass

* revert examples

* fix missing ;

* fix compilation: 43 tests pass

* Remove warns, fix translate_from_mir.rs

* Fix Inlining

* Fix inlining after git merge inconsistency

* Have parameter reference ref_node, and better handle evaluator params / args

* Update inlining2.rs

* Only inline Params that target the ref_node we aim

* Improve diags on Evaluator args and params mismatch

* Improve diags again

* Reworked visit_if to work with selectors, kept visit_if_old

* cargo clippy

* Clippy fix

* Clippy fix and remove prints

* cargo fmt

* cargo clippy + fmt

* Remove prints

* Builder: ignore fields that start with underscore

* singleton converters

* patch link

* api for shared mutability

* transform BackLink to Link when comparing and hashing singlton enum
wrappers

* swap *obj.borrow_mut() = value to obj.set(value)

* use ptr as hashmap key to preserve 1 key per instance
restore PartialEq Invariant for HashMap key

* use pointer as HashMap keys

* Some fixes for translate + inlining

* fix BackLink fields' mutability in Builder derive macro

* cargo fmt

* fix Parameter.ref_node cyclic reference + restric to Owner + compare via get_ptr

fix Parameter.ref_node cyclic reference:
Parameter.ref_Node used to store a Link to its parent which caused
PartialEq and Hash to loop.
Replaced with a BackLink
 + restricted the field from Node -> Owner
 + expose inner get_ptr on Owner
 + comparison and hash via get_ptr

* Update translate, mod and inlining

* Fix nested evaluators

* fix Parameter PartialEq and Hash to work on disconnected identical graphs

* add debugging method to Op/Root + Link/BackLink

* filter stale node wrappers in visitor

* fix edgecase in `Link<Op>::set`

* manual pointer review checks to verify Op::set logic

* fix parameters

* Fix let translation

* Debug air-ir tests, begin codegen tests

* Set program name in Mir

* Fix mutability test

---------

Co-authored-by: Thybault Alabarbe <[email protected]>

* cargo fmt

* removed unused files

* rename passes files

* Cargo check fixes

* cargo clippy fixes

* Improve Accessor of trace columns handling

* Add Mir Exp Node, fix Accessor logic

* Update expected codegen results (equivalent, but less optimized)

* Fix functions_complex and evaluator codegen tests

* fix uneeded unwrap

* Make test pass

* cargo clippy fixes

* cargo fmt

* Defaults to Mir pipeline in CLI, expose CLI arg to bypass Mir

* removed redundant ignored tests

* derive Spanned on Evaluator, Function, Parameter, and For

* derive Span for Root

* chore: refactor and document inlining, translate, pass helpers

* derive Hash for Accessor

* Derive Spanned for all node types and wrappers

* Propagate span + cargo fmt

* unrolling documentation + cargo check fixes

* Add translate_from_mir documentation

* mirror PartalEq field usage in Hash to restore Hash invariant

* cargo clippy fixes

---------

Co-authored-by: Leo-Besancon <[email protected]>
* document derive-ir helpers

* document Visitor

* document Link and BackLink

* Small tweaks and fix typos in documentation for MirGraph, Accessor

* documentation for Root, Evaluator, Function and Parameter

* documentation for Op

* cargo fmt + clippy

* documentation for get_inner, get_inner_mut helpers

* documentation for Op::debug and tweak docs

* documentation tweaks for Root

* documentation for Node

* documentation tweaks

* documentation for Owner + implement missing as_node

* documentation for Child and Parent methods

* documentation tweaks for Builder

* documentation tweaks for Value

* refactor: Move parameter in ops module

---------

Co-authored-by: Thybault Alabarbe <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants