Skip to content

Optimize quotient decomposition and coset LDEs #950

@adr1anh

Description

@adr1anh

Right now, the prover decomposes the quotient as follows

  1. Reshape the quotient into D column
  2. Perform an iDFT
  3. Manually perform a batch coset shift
  4. Compute the DFT to obtain the evaluations over the LDE domain

Unfortunately, the DFT interface takes as input and returns matrices in natural order, which means that the internal implementations perform redundant bit-reversals. We could instead preserve the bit-reversed order between the two DFT calls, and only bit-reverse the row_base, which would be slightly more efficient. This may require modifications to Plonky3's TwoAdicSubgroupDft crate though.

Moreover, the current coset LDE implementations used for trace commitments handle coset shifts sequentially by iterating over powers of the shift. This could be parallelized by allocating the vector of powers using shift.powers which is optimized to use parallelism and packing, and scaling the rows in parallel.


Ported from 0xMiden/p3-miden#28

Metadata

Metadata

Assignees

No one assigned

    Labels

    starksRelated to Plonky3 migration

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions