Description
The ‘magic sauce’ that allows neos to differentiate through the fitting process is based on an implementation of fixed-point differentiation. As I understand it, the gist of how this works is that if a function has a fixed point, i.e. f(x) = x for some x (e.g. a `minimize(F, x_init)` routine evaluated at `x_init` = the minimum of F, which just returns that minimum again), then one can evaluate the gradients through a second pass of the function evaluated close to the fixed point, instead of differentiating through every iteration of the minimization itself.
It would be nice to consolidate some thoughts (perhaps in a notebook) on the technical details for those interested. The specific algorithm used in neos can be found in section 2.3 of this paper (two-phase method).
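As a starting point for such a notebook, a rough JAX sketch of the idea could look like the following. To be clear about what's assumed: the `fixed_point` helper, `rev_iter`, the convergence tolerance, and the toy square-root iteration are all made up here for illustration (loosely following the fixed-point example in the JAX custom-derivatives docs), not neos's actual implementation. Phase one iterates forward to the fixed point x*; phase two runs a second fixed-point iteration near x* to obtain the gradients, so nothing is backpropagated through the forward loop.

```python
# Rough sketch only: names, tolerance, and the toy iteration are illustrative,
# not neos's actual code.
from functools import partial

import jax
import jax.numpy as jnp


@partial(jax.custom_vjp, nondiff_argnums=(0,))
def fixed_point(f, params, x_init):
    """Phase 1: iterate x <- f(params, x) until it stops changing."""
    def cond(carry):
        x_prev, x = carry
        return jnp.max(jnp.abs(x - x_prev)) > 1e-8

    def body(carry):
        _, x = carry
        return x, f(params, x)

    _, x_star = jax.lax.while_loop(cond, body, (x_init, f(params, x_init)))
    return x_star


def fixed_point_fwd(f, params, x_init):
    x_star = fixed_point(f, params, x_init)
    return x_star, (params, x_star)


def rev_iter(f, packed, u):
    # One step of the adjoint iteration: u <- x_star_bar + (df/dx)^T u,
    # with the Jacobian evaluated at the fixed point x*.
    params, x_star, x_star_bar = packed
    _, vjp_x = jax.vjp(lambda x: f(params, x), x_star)
    return x_star_bar + vjp_x(u)[0]


def fixed_point_bwd(f, res, x_star_bar):
    # Phase 2: solve the adjoint fixed-point equation with a second pass of
    # fixed_point near x*, then map the result through (df/dparams)^T.
    # No gradients ever flow through the phase-1 loop itself.
    params, x_star = res
    _, vjp_params = jax.vjp(lambda p: f(p, x_star), params)
    w = fixed_point(partial(rev_iter, f), (params, x_star, x_star_bar), x_star_bar)
    (params_bar,) = vjp_params(w)
    return params_bar, jnp.zeros_like(x_star)  # no gradient w.r.t. x_init


fixed_point.defvjp(fixed_point_fwd, fixed_point_bwd)

# Toy check: x <- 0.5 * (x + a / x) has fixed point sqrt(a),
# so the gradient w.r.t. a should be 0.5 / sqrt(a) ~= 0.354 at a = 2.
f = lambda a, x: 0.5 * (x + a / x)
print(jax.grad(lambda a: fixed_point(f, a, 1.0))(2.0))
```

The same pattern (one forward solve, one adjoint solve) is the two-phase structure mentioned above; in a notebook, the toy iteration could then be swapped out for the actual fit.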