
Prototype support for async native functions #4237


Draft
wants to merge 1 commit into
base: main
from



@rib rib commented May 9, 2025

Based on some discussion in #3442 (comment), this experiments with enabling support for async NativeFunctions that are only async from the point of view of the host and appear synchronous from within JavaScript.

Instead of running the async functions as a Promise via enqueue_job, this works by allowing Operations to be executed over multiple VM cycles, so an Operation may start some async work in one step and then further steps can poll for completion of that work and finish the Operation.

In particular this works by allowing Call Operations to return an OpStatus::Pending value that indicates that the same Call operation needs to be executed repeatedly, until it returns an OpStatus::Finished status.

In the case of a Pending status, the program counter is reset and anything that was taken off the stack is pushed back so the same Operation can be re-executed.
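To make the rewind concrete, here is a minimal sketch of the idea using a toy VM rather than Boa's real types. `ToyVm`, `call_op`, `run_call` and the `polls_remaining` counter are all hypothetical names invented for illustration; only `OpStatus::Pending` and `OpStatus::Finished` come from the description above.

```rust
/// Status returned by one execution of an operation (variant names follow the PR text).
#[derive(Debug, PartialEq)]
enum OpStatus {
    Finished,
    Pending,
}

/// Toy stand-in for the VM: a program counter and a value stack.
struct ToyVm {
    pc: usize,
    stack: Vec<i64>,
    /// Hypothetical stand-in for async work: the call finishes once this reaches zero.
    polls_remaining: u32,
}

impl ToyVm {
    /// Executes the toy `Call` operation once.
    fn call_op(&mut self) -> OpStatus {
        let saved_pc = self.pc;
        self.pc += 1; // decoding consumes the opcode
        let arg = self.stack.pop().expect("argument on stack");

        if self.polls_remaining > 0 {
            self.polls_remaining -= 1;
            // Not done yet: push the argument back and restore the pc so the
            // same operation is decoded and executed again on the next step.
            self.stack.push(arg);
            self.pc = saved_pc;
            return OpStatus::Pending;
        }

        self.stack.push(arg * 2); // the "result" of the native function
        OpStatus::Finished
    }

    /// Re-executes the same operation until it finishes; returns the number
    /// of `Pending` results seen plus one.
    fn run_call(&mut self) -> u32 {
        let mut steps = 1;
        while self.call_op() == OpStatus::Pending {
            steps += 1;
        }
        steps
    }
}
```

After a `Pending` result the VM state is indistinguishable from the state before the operation ran, which is what lets the same opcode be executed again.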

There is a new NativeFunction::from_async_as_sync_with_captures() that lets the host provide a (sync) closure that itself returns / spawns a boxed Future. This is tracked internally as an Inner::AsyncFn.

Whenever the function is __call__ed then (assuming the operation isn't already in a pending / running state) a new Future is spawned via the application's closure and the Operation enters a "pending" state.

When a NativeFunction is pending then each __call__ will poll() the spawned Future to see if the async function has a result.
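The spawn-then-poll behaviour could be sketched roughly as follows, using only the standard library. This is not Boa's API: `AsyncFn`, `CallStatus`, `BoxedFuture`, `NoopWake` and `Countdown` are hypothetical names, and values are plain `i32`s instead of JavaScript values. The no-op waker is built with std's `Wake` trait.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the boxed future returned by the host's closure.
type BoxedFuture = Pin<Box<dyn Future<Output = i32>>>;

/// A waker that does nothing when woken.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Result of one call attempt (hypothetical; not Boa's actual type).
#[derive(Debug, PartialEq)]
enum CallStatus {
    Pending,
    Finished(i32),
}

/// Toy async-as-sync function: a sync closure that spawns a future, plus
/// the in-flight future while a call is pending.
struct AsyncFn<F: Fn(i32) -> BoxedFuture> {
    spawn: F,
    in_flight: Option<BoxedFuture>,
}

impl<F: Fn(i32) -> BoxedFuture> AsyncFn<F> {
    fn new(spawn: F) -> Self {
        Self { spawn, in_flight: None }
    }

    /// Spawns the future if no call is pending, then polls it once.
    fn call(&mut self, arg: i32) -> CallStatus {
        let spawn = &self.spawn;
        let fut = self.in_flight.get_or_insert_with(|| spawn(arg));

        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        let polled = fut.as_mut().poll(&mut cx);

        match polled {
            Poll::Pending => CallStatus::Pending,
            Poll::Ready(value) => {
                self.in_flight = None; // the next call spawns a fresh future
                CallStatus::Finished(value)
            }
        }
    }
}

/// Example future: reports `Pending` `remaining` times, then yields `value`.
struct Countdown {
    remaining: u32,
    value: i32,
}

impl Future for Countdown {
    type Output = i32;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<i32> {
        if self.remaining == 0 {
            Poll::Ready(self.value)
        } else {
            self.remaining -= 1;
            Poll::Pending
        }
    }
}
```

Because the waker is a no-op, nothing tells the caller when to poll again; it has to keep calling until it sees `Finished`.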

This effectively stalls the VM at the same Opcode while still accounting for any cycle budget and periodically yielding to the application's async runtime while waiting for an async Call Operation to finish.

Limitations / Issues

Busy Loop Polling

Even though the implementation does yield back to the application's async runtime when waiting for a NativeFunction to complete, it isn't ideal because it uses a noop task Context + Waker when polling NativeFunction Futures. This effectively relies on the VM polling the future in a busy loop, wasting CPU time.

A better solution could be to implement a shim Waker that would flag some state on the Boa engine Context, and then adapt the Future that's used to yield the VM to the executor so that it only becomes Ready once the async NativeFunction has signalled the waker. I.e. the Waker would act like a bridge/proxy between a spawned async NativeFunction and the Future/Task associated with the VM's async run_async_with_budget.

This way I think the VM could remain async runtime agnostic but would be able to actually sleep while waiting for async functions instead of entering a busy yield loop.
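One possible shape for such a bridge, sketched with std only. All names here (`BridgeWaker`, `register_vm_task`, `take_woken`, `CountingWake`) are hypothetical, and the sketch assumes the VM's yield future would register its own task waker before returning `Pending`; none of this exists in the PR.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};

/// Hypothetical bridge between the native function's future and the VM's task.
struct BridgeWaker {
    /// Set when the native function's future signals progress.
    woken: AtomicBool,
    /// Waker of the task driving the VM, registered when the VM yields.
    vm_task: Mutex<Option<Waker>>,
}

impl BridgeWaker {
    fn new() -> Arc<Self> {
        Arc::new(Self {
            woken: AtomicBool::new(false),
            vm_task: Mutex::new(None),
        })
    }

    /// Called by the VM's yield future to record which task should be woken.
    fn register_vm_task(&self, waker: Waker) {
        *self.vm_task.lock().unwrap() = Some(waker);
    }

    /// Called by the VM before re-polling; reads and clears the flag.
    fn take_woken(&self) -> bool {
        self.woken.swap(false, Ordering::AcqRel)
    }
}

impl Wake for BridgeWaker {
    fn wake(self: Arc<Self>) {
        self.woken.store(true, Ordering::Release);
        // Forward the wake-up to the executor task that is running the VM.
        // The lock guard is released before the waker is invoked.
        let vm_task = self.vm_task.lock().unwrap().take();
        if let Some(waker) = vm_task {
            waker.wake();
        }
    }
}

/// Example helper: counts how often the "VM task" was woken.
struct CountingWake(AtomicUsize);

impl Wake for CountingWake {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}
```

The native function's future would be polled with `Waker::from(bridge.clone())`, and the VM would only re-poll it when `take_woken()` returns `true`. Since only std's `Waker` is involved, this keeps the VM independent of any particular async runtime.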

Requires PC rewind and reverting stack state

Ideally operations that may complete over multiple steps would maintain a state machine via private registers, whereby it would not be necessary to repeatedly rewind the program counter and re-push values to the stack so that the operation can be decoded and executed repeatedly from the beginning.

Only adapts Call Operation

Currently only the Call Operation handles async NativeFunctions but there are other Call[XYZ] Operations that could be adapted too.

Not compatible with composite Operations that make function calls internally

The ability to track pending async functions is implemented in terms of repeatedly executing an Opcode in the VM until it signals that it's not Pending.

This currently relies on being able to reset and re-execute the Operation (such as reverting program counter and stack changes).

There are various Operations that make use of JsObject::call() internally and they would currently trigger a panic if they called an async NativeFunction because they would not be able to "resolve()" the "AsyncPending" status that would be returned by the call().

Ideally all Operations that use __call__ or __construct__ should support CallValue::AsyncPending and be fully resumable in the same way that the Call Operation is now.

This would presumably be easier to achieve with Rust Coroutines if they were stable because it would otherwise be necessary to adapt composite Operations into a state machine, similar to what the compiler does for an async Future, so they can yield for async function calls and be resumed by the VM.
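As a rough illustration of what a hand-written state machine for a composite Operation might look like, here is a toy operation that makes two calls in sequence and can be suspended at either one. `TwoCalls`, `Step` and the `Ready` variant are hypothetical names; only `CallValue::AsyncPending` is taken from the description above, and its real shape in the PR may differ.

```rust
/// Outcome of one call attempt (hypothetical stand-in for a pending-aware `call()`).
enum CallValue {
    Ready(i32),
    AsyncPending,
}

/// Hand-written state machine for a toy composite operation that calls
/// two functions in sequence and adds their results.
enum TwoCalls {
    First,
    Second { first_result: i32 },
    Done,
}

/// Status of one resume step.
#[derive(Debug, PartialEq)]
enum Step {
    Pending,
    Finished(i32),
}

impl TwoCalls {
    /// Resumes the operation; `call` is invoked with the index of the
    /// call that is currently outstanding (0 or 1).
    fn resume(&mut self, mut call: impl FnMut(u8) -> CallValue) -> Step {
        loop {
            match *self {
                TwoCalls::First => match call(0) {
                    CallValue::AsyncPending => return Step::Pending,
                    CallValue::Ready(v) => *self = TwoCalls::Second { first_result: v },
                },
                TwoCalls::Second { first_result } => match call(1) {
                    CallValue::AsyncPending => return Step::Pending,
                    CallValue::Ready(v) => {
                        *self = TwoCalls::Done;
                        return Step::Finished(first_result + v);
                    }
                },
                TwoCalls::Done => panic!("resumed after completion"),
            }
        }
    }
}
```

The result of the first call is kept in the state, so when the second call is pending the operation resumes there instead of restarting from the beginning. Coroutines would let the compiler generate this kind of enum automatically.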

Addresses: #3442

@rib rib force-pushed the rib/async-native-functions branch 2 times, most recently from 8a508a2 to 38f7f58 Compare May 14, 2025 20:04
This experiments with enabling support for async NativeFunctions that
are only async from the pov of the host, and appear as synchronous from
within JavaScript.

Instead of running the async functions as a Promise via enqueue_job,
this works by allowing Operations to be executed over multiple VM
cycles, so an Operation may start some async work in one step and then
further steps can poll for completion of that work and finish the Operation.

In particular this works by allowing Call Operations to return an
`OpStatus::Pending` value that indicates that the same Call
operation needs to be executed repeatedly, until it returns an
`OpStatus::Finished` status.

In the case of a `Pending` status, the program counter is reset
and anything that was taken off the stack is pushed back so the same
Operation can be re-executed.

There is a new `NativeFunction::from_async_as_sync_with_captures()`
that lets the host provide a (sync) closure that itself returns / spawns
a boxed Future. This is tracked internally as an `Inner::AsyncFn`.

Whenever the function is `__call__`ed then (assuming the operation isn't
already in a pending / running state) a new Future is spawned via the
application's closure and the Operation enters a "pending" state.

When a NativeFunction is pending then each `__call__` will `poll()` the
spawned `Future` to see if the `async` function has a result.

This effectively stalls the VM at the same Opcode while still accounting
for any cycle budget and periodically yielding to the application's
async runtime while waiting for an async Call Operation to finish.

Limitations / Issues
====================

== Busy Loop Polling ==

Even though the implementation does yield back to the application's
async runtime when waiting for a NativeFunction to complete, the
implementation isn't ideal because it uses a noop task Context
+ Waker when polling NativeFunction Futures. This effectively relies on
the VM polling the future in a busy loop, wasting CPU time.

A better solution could be to implement a shim Waker that would flag
some state on the Boa engine Context, and then adapt the Future that's
used to yield the VM to the executor so that it only becomes Ready once
the async NativeFunction has signalled the waker. I.e. the Waker would
act like a bridge/proxy between a spawned async NativeFunction and the
Future/Task associated with the VM's async `run_async_with_budget`.

This way I think the VM could remain async runtime agnostic but would
be able to actually sleep while waiting for async functions instead
of entering a busy yield loop.

== Requires PC rewind and reverting stack state ==

Ideally operations that may complete over multiple steps would maintain
a state machine via private registers, whereby it would not be necessary
to repeatedly rewind the program counter and re-push values to the stack
so that the operation can be decoded and executed repeatedly from the
beginning.

== Only adapts Call Operation ==

Currently only the Call Operation handles async NativeFunctions but
there are other Call[XYZ] Operations that could be adapted too.

== Not compatible with composite Operations that `call()` ==

The ability to track pending async functions is implemented in terms of
repeatedly executing an Opcode in the VM until it signals that it's not
Pending.

This currently relies on being able to reset and re-execute the
Operation (such as reverting program counter and stack changes).

There are lots of Operations that make use of JsObject::call()
internally and they would currently trigger a panic if they called an
async NativeFunction because they would not be able to "resolve()"
the "Pending" status that would be returned by the `call()`.

Ideally all Operations that use `__call__` or `__construct__` should
be fully resumable in the same way that the Call Operation is now.

This would presumably be easier to achieve with Rust Coroutines if they
were stable because it would otherwise be necessary to adapt composite
Operations into a state machine, similar to what the compiler does for
an async Future, so they can yield for async function calls and be
resumed by the VM.
@rib rib force-pushed the rib/async-native-functions branch from 38f7f58 to 9f8760e Compare May 14, 2025 20:09