Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Buses #8

Closed
wants to merge 43 commits into from
Closed

Conversation

Leo-Besancon
Copy link
Collaborator

@Leo-Besancon Leo-Besancon commented Feb 6, 2025

This PR aims to implement Buses in AirScript, with a design based on the following issue: 0xPolygonMiden#183

It currently relies on this PR being merged: 0xPolygonMiden#359

Current status:

  • We've updated the parser to handle:
    • bus declarations Bus {span, bus_type}
    • bus boundary constraints, which are parsed in a similar way to other constraints ( enf <SymbolAccess>.boundary = ScalarExpr )
    • bus integrity constraints BusOperation {bus, bus_operator, args}, which are parsed as ListComprehensions to have an easy access to a selector/multiplicity,
      similar to how conditional constraints are handled
      As a result, the AST can parse, analyse semantically and have constant propagation done for the following tests, ignored as we don't handle them in the MIR: https://github.com/massalabs/air-script/blob/feat-implement-bus/mir/src/tests/buses.rs
  • We're currently adding nodes to the MIR (e.g. BusOperation) to represent these operations

We have a few questions:

  1. Does this parsing make sence to you (for instance, reusing ListComprehensions)?
  2. How should we handle boundary constraints on non-empty buses? Do they always reference public inputs? Can they be something else (constants, ...), and is there a distinction between first and last boundaries? I'm thinking about this comment: # TODO: we still need to figure out how to initialize the bus to non-empty values here: Permutation check syntax 0xPolygonMiden/air-script#183 (comment)
  3. Regarding the backend design choice (representing bus constraints in Air Vs reducing them to constraints in the algebraic graph), did you have more thoughts on what is wanted?
  4. According to the same comment, it seems that 3. relies on removing the aux segment and random values from the front-end, should we work on this first?

@bobbinth
Copy link
Collaborator

Here are my answers to the last 3 questions. Also tagging @Al-Kindi-0, @plafer, @paracetamolo in case I missed anything.

2. How should we handle boundary constraints on non-empty buses? Do they always reference public inputs? Can they be something else (constants, ...)

I think boundary constraints are either empty or they alway reference public inputs - I can't think of a way to initialize them to something that is just a constant in a meaningful way.

is there a distinction between first and last boundaries?

Yes, first and last mean different things and the meaning is similar to the meaning these keywords have for regular boundary constraints. One way to thin of it is that boundary constraints on a bus define initial and final values for the auxiliary column which defines the bus.

The way to reference public inputs could be as simple as:

public_inputs {
    x: [[2]],
    y: [[3]],
}

buses {
    unit p,
}

boundary_constraints {
    enf p.first = x;
    enf p.last = y;
}

This would resolve to the first value of column $p$ being constrained by:

$$ \prod_{i=0}^n (\alpha_0 + \alpha_1 \cdot x_{i,0} + \alpha_2 \cdot x_{i, 1}) $$

Where the $i$ is the number of rows in the table $x$.

And the last value of $p$ constrained by:

$$ \prod_{j=0}^n (\alpha_0 + \alpha_1 \cdot y_{j,0} + \alpha_2 \cdot y_{j, 1} + \alpha_3 \cdot y_{j, 2}) $$

Where the $j$ is the number of rows in the table $y$.

Note that if $p$ was a LogUp-based bus, the expressions would be different - they would be sums of fractions rather than products.

  • Regarding the backend design choice (representing bus constraints in Air Vs reducing them to constraints in the algebraic graph), did you have more thoughts on what is wanted?

There are pros and cons to each. On the one hand, having all constraints described uniformly in the algebraic graph seems like a good idea. On the other hand, if the algebraic graph can be simplified significantly be making it work only over the base field constraints - this would be a good idea as well. I think the question here is what would be easer to implement initially. We can start there and then improve the code after that (my hunch is putting all constraints into the algebraic graph would probably be that).

  • According to the same comment, it seems that 3. relies on removing the aux segment and random values from the front-end, should we work on this first?

I think this would need to be done regardless because once the busses are introduced I see relatively little utility to exposing auxiliary segment constraints to the end users. But this could also be done as a follow-up PR.

@Leo-Besancon
Copy link
Collaborator Author

Leo-Besancon commented Feb 19, 2025

Thank you @bobbinth, it's very clear! It seems the easiest for now is to introduce a small pass before lowering the MIR into the AIR to expand the bus constraints, so we've started implementing this.

public_inputs {
    x: [[2]],
    y: [[3]],
}

A follow-up question on public inputs, currently public inputs can only be fixed-size vectors (x: [12] for a vector of 12 Felts).
Should we allow them to be fixed-size matrices x: [[2]; 4] (or x: [[2; 4]] or similar) for a table of 2 columns and 4 rows? Knowing the full input size would probably be required to include these constraints in the graph.

Or did you have something else in mind with the notation x: [[2]]? If you were thinking of having dynamically sized public inputs, we'll probably need to have during codegen a step to compute the value depending on the provided input, and inject the result into the graph if that's possible, don't you think so?

@bobbinth
Copy link
Collaborator

Or did you have something else in mind with the notation x: [[2]]?

The idea here was this would be a table with a fixed number of columns (i.e., 2 columns, in this example) but dynamic number of rows.

If you were thinking of having dynamically sized public inputs, we'll probably need to have during codegen a step to compute the value depending on the provided input, and inject the result into the graph if that's possible, don't you think so?

Ah yes - good point! We'd have a static equation for "collapsing" each row into a single value, but we'd need to have a dynamic expression to "collapse" all row values into a single value. This dynamic expression cannot go into the algebraic graph and so would need to be kept and processed separately. The approach could be the Bus struct I described in 0xPolygonMiden#183 (comment).

The codegen for this would need to output the "table collapsing" logic which the verifier would execute to get a single value, and then this value would become the boundary constraint for the bus column (we can do this in a separate PR).

@Leo-Besancon
Copy link
Collaborator Author

Leo-Besancon commented Feb 20, 2025

Sounds good!

Here is an updated status checklist for this PR:

  • Parse bus related expressions (AST)
  • Handle semantic analysis and constant propagation for buses (AST)
  • In a new MIR > MIR pass, before lowering to the AIR, expand bus integrity constraints and handle empty bus boundary constraints
  • Add tests (in parser, Mir, Air, and Codegen) for new bus expressions
  • Study the bus codegen tests (Winterfell and Masm) output, check that the resulting constraints are expected
  • Add documentation on buses

In follow-up PRs, we should handle:

@Leo-Besancon
Copy link
Collaborator Author

Closed in favor of 0xPolygonMiden#371

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants