Prototype support for async native functions #4237

rib · 2025-05-09T10:24:03Z

Based on some discussion in #3442 (comment), this experiments with support for async NativeFunctions that are only async from the host's point of view and appear synchronous from within JavaScript.

Instead of running the async functions as a Promise via enqueue_job, this works by allowing Operations to be executed over multiple VM cycles, so an Operation may start some async work in one step and then further steps can poll for completion of that work and finish the Operation.

In particular this works by allowing Call Operations to return an OpStatus::Pending value that indicates that the same Call operation needs to be executed repeatedly, until it returns an OpStatus::Finished status.

In the case of a Pending status, the program counter is reset and anything that was taken off the stack is pushed back so the same Operation can be re-executed.
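As a rough illustration of the rewind behaviour described above, here is a minimal toy sketch. It is not Boa's actual VM code: `ToyVm`, `exec_call` and the `polls_until_ready` countdown are hypothetical stand-ins, and only the `OpStatus::Pending` / `OpStatus::Finished` names come from this PR.

```rust
/// Status an operation reports to the VM loop (names taken from this PR,
/// the rest of this sketch is hypothetical).
#[derive(Debug, PartialEq)]
enum OpStatus {
    Finished,
    Pending,
}

/// Toy VM: a program counter, a value stack, and a countdown that stands in
/// for async work that is not ready yet.
struct ToyVm {
    pc: usize,
    stack: Vec<i64>,
    polls_until_ready: u32,
}

impl ToyVm {
    /// A toy "call" operation: pops one argument and either completes or
    /// asks to be re-executed.
    fn exec_call(&mut self) -> OpStatus {
        let saved_pc = self.pc;
        self.pc += 1; // decoding the opcode advances the program counter
        let arg = self.stack.pop().expect("argument on stack");

        if self.polls_until_ready > 0 {
            self.polls_until_ready -= 1;
            // Undo the side effects so the same operation is decoded again.
            self.stack.push(arg);
            self.pc = saved_pc;
            return OpStatus::Pending;
        }

        self.stack.push(arg * 2); // the "result" of the call
        OpStatus::Finished
    }

    /// Re-executes the operation until it finishes; returns the step count.
    fn run(&mut self) -> u32 {
        let mut steps = 1;
        while self.exec_call() == OpStatus::Pending {
            steps += 1;
        }
        steps
    }
}
```

The point of the sketch is that a Pending step leaves `pc` and `stack` exactly as it found them, so the VM loop can simply execute the same opcode again.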

There is a new NativeFunction::from_async_as_sync_with_captures() that lets the host provide a (sync) closure that itself returns / spawns a boxed Future. This is tracked internally as an Inner::AsyncFn.

Whenever the function is __call__ed then (assuming the operation isn't already in a pending / running state) a new Future is spawned via the application's closure and the Operation enters a "pending" state.

When a NativeFunction is pending then each __call__ will poll() the spawned Future to see if the async function has a result.

This effectively stalls the VM at the same Opcode while still accounting for any cycle budget and periodically yielding to the application's async runtime while waiting for an async Call Operation to finish.
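The spawn-once-then-poll flow can be sketched roughly as follows. This is an assumption-laden toy, not the code in this PR: `AsyncNative`, `Countdown`, `NoopWake` and `doubling_native` are hypothetical names, and the real `NativeFunction::from_async_as_sync_with_captures()` signature is not reproduced here. Only the standard library `Future`, `Context`, `Wake` and `Waker` APIs are real.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type BoxedFuture = Pin<Box<dyn Future<Output = i64>>>;

/// A waker that ignores wake-ups, so the caller has to keep polling.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy future: reports Pending `remaining` times, then resolves to `value`.
struct Countdown {
    remaining: u32,
    value: i64,
}

impl Future for Countdown {
    type Output = i64;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<i64> {
        if self.remaining == 0 {
            Poll::Ready(self.value)
        } else {
            self.remaining -= 1;
            Poll::Pending
        }
    }
}

/// Hypothetical stand-in for an async native function: a sync closure that
/// spawns a boxed future, plus the future of the call currently in flight.
struct AsyncNative {
    spawn: Box<dyn Fn(i64) -> BoxedFuture>,
    pending: Option<BoxedFuture>,
}

impl AsyncNative {
    /// One VM step: spawn the future on the first call, then poll it once.
    /// Returns `None` while the call is still pending.
    fn call(&mut self, arg: i64) -> Option<i64> {
        let spawn = &self.spawn;
        let future = self.pending.get_or_insert_with(|| spawn(arg));

        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => {
                self.pending = None; // ready for the next, unrelated call
                Some(value)
            }
            Poll::Pending => None,
        }
    }
}

/// Builds a toy native function that doubles its argument after `polls` polls.
fn doubling_native(polls: u32) -> AsyncNative {
    AsyncNative {
        spawn: Box::new(move |arg: i64| -> BoxedFuture {
            Box::pin(Countdown { remaining: polls, value: arg * 2 })
        }),
        pending: None,
    }
}
```

Note that the no-op waker here is exactly what leads to the busy polling limitation: nothing tells the caller when polling again is worthwhile.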

Limitations / Issues

Busy Loop Polling

Even though the implementation does yield back to the application's async runtime when waiting for a NativeFunction to complete, it isn't ideal because it uses a no-op task Context + Waker when polling NativeFunction Futures. This effectively relies on the VM polling the future in a busy loop, wasting CPU time.

A better solution could be to implement a shim Waker that would flag some state on the Boa engine Context, and then adapt the Future that's used to yield the VM to the executor so that it only becomes Ready once the async NativeFunction has signalled the waker. I.e. the Waker would act like a bridge/proxy between a spawned async NativeFunction and the Future/Task associated with the VM's async run_async_with_budget.

This way I think the VM could remain async runtime agnostic but would be able to actually sleep while waiting for async functions instead of entering a busy yield loop.
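A minimal sketch of what such a shim Waker might look like, using only the standard library `Wake` trait. `FlagWaker`, `make_shim_waker` and `should_poll` are hypothetical names, and the shared flag stands in for "some state on the Boa engine Context"; none of this exists in the PR.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Wake, Waker};

/// Hypothetical shim: records that the native function's future has
/// signalled it is worth polling again.
struct FlagWaker {
    woken: AtomicBool,
}

impl Wake for FlagWaker {
    fn wake(self: Arc<Self>) {
        self.woken.store(true, Ordering::Release);
    }
}

/// Returns the shared flag plus a `Waker` that can be handed to the native
/// function's future when polling it.
fn make_shim_waker() -> (Arc<FlagWaker>, Waker) {
    let flag = Arc::new(FlagWaker { woken: AtomicBool::new(false) });
    let waker = Waker::from(Arc::clone(&flag));
    (flag, waker)
}

/// VM side: only re-poll when the flag was set, clearing it in the process.
fn should_poll(flag: &FlagWaker) -> bool {
    flag.woken.swap(false, Ordering::AcqRel)
}
```

A complete bridge would presumably also need `wake()` to forward to the Waker of the task driving the VM, so that the executor knows to resume it; that part is left out of this sketch.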

Requires PC rewind and reverting stack state

Ideally operations that may complete over multiple steps would maintain a state machine via private registers, whereby it would not be necessary to repeatedly rewind the program counter and re-push values to the stack so that the operation can be decoded and executed repeatedly from the beginning.
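To illustrate the idea, here is a toy sketch of an operation that keeps its progress in a private register instead of rewinding. `CallState` and `ResumableCall` are hypothetical names and the `polls_needed` countdown only simulates async work; this is not how the PR or Boa currently works.

```rust
/// Hypothetical per-operation state kept in a private register between steps.
enum CallState {
    /// Operands have not been read from the stack yet.
    Start,
    /// Operands were consumed once; the call is still waiting on async work.
    Waiting { arg: i64, polls_left: u32 },
}

struct ResumableCall {
    state: CallState,
}

impl ResumableCall {
    /// Runs one step. Returns `Some(result)` when done, `None` while pending.
    /// The stack is only touched on the first step, so nothing is re-pushed.
    fn step(&mut self, stack: &mut Vec<i64>, polls_needed: u32) -> Option<i64> {
        let (arg, polls_left) = match self.state {
            CallState::Start => (stack.pop().expect("argument on stack"), polls_needed),
            CallState::Waiting { arg, polls_left } => (arg, polls_left),
        };

        if polls_left == 0 {
            self.state = CallState::Start;
            Some(arg * 2) // the "result" of the call
        } else {
            self.state = CallState::Waiting { arg, polls_left: polls_left - 1 };
            None
        }
    }
}
```

Compared with the rewind approach, the operands are decoded once and the program counter never moves backwards; the cost is that each resumable operation needs its own state.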

Only adapts Call Operation

Currently only the Call Operation handles async NativeFunctions but there are other Call[XYZ] Operations that could be adapted too.

Not compatible with composite Operations that make function calls internally

The ability to track pending async functions is implemented in terms of repeatedly executing an Opcode in the VM until it signals that it's not Pending.

This currently relies on being able to reset and re-execute the Operation (such as reverting program counter and stack changes).

There are various Operations that make use of JsObject::call() internally and they would currently trigger a panic if they called an async NativeFunction because they would not be able to "resolve()" the "AsyncPending" status that would be returned by the call().

Ideally all Operations that use __call__ or __construct__ should support CallValue::AsyncPending and be fully resumable in the same way that the Call Operation is now.

This would presumably be easier to achieve with Rust Coroutines if they were stable because it would otherwise be necessary to adapt composite Operations into a state machine, similar to what the compiler does for an async Future, so they can yield for async function calls and be resumed by the VM.

Addresses: #3442


rib mentioned this pull request May 9, 2025

Cancel/Interrupt active evaluation #3442

Open

rib force-pushed the rib/async-native-functions branch 2 times, most recently from 8a508a2 to 38f7f58 (May 14, 2025 20:04)

rib force-pushed the rib/async-native-functions branch from 38f7f58 to 9f8760e (May 14, 2025 20:09)
