I was reviewing the On-Device Web Speech proposal (cc @evanbliu), which has a similar need to manage a downloadable chunk of data, and I think we should ensure that the shape of the API here covers features like that one, even if Web Speech has backward-compatibility constraints that prevent it from actually adopting this shape.
In particular, the Speech API gives the user the ability to refuse the download and do the processing in the cloud instead. Websites with particular privacy requirements might want to request that the browser only do the processing locally, but a user with a small or slow device might prefer to override the website's request.
For a not-yet-downloaded model that's also available in the cloud, I assume the current availability() would return "after-download". Calling create() would show the user a prompt to let them pick between downloading and sending data to the cloud. (Should the website be able to pass in an estimate of how much data it's planning to send into the model, so users can trade off download vs runtime data transfer?) If the user picks either 'download' or 'use the cloud', availability() would transition to "readily".
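To make the assumed flow concrete, here is a minimal mock of it. This is only a sketch: `MockModel` and the `userChoice` argument (standing in for the browser's prompt) are hypothetical, and the real `create()` would be asynchronous rather than a plain synchronous call.

```javascript
// Hypothetical mock of the flow described above; not a real browser API.
class MockModel {
  constructor() {
    // Not downloaded yet, but a cloud fallback exists.
    this.state = "after-download";
    this.backend = null;
  }

  availability() {
    return this.state;
  }

  // userChoice stands in for the browser prompt: "download" or "cloud".
  create(userChoice) {
    if (this.state === "after-download") {
      if (userChoice !== "download" && userChoice !== "cloud") {
        throw new Error("user declined both options");
      }
      this.backend = userChoice === "download" ? "on-device" : "cloud";
      // Either choice makes the model usable without a further prompt.
      this.state = "readily";
    }
    return { backend: this.backend };
  }
}

const model = new MockModel();
console.log(model.availability()); // "after-download"
const session = model.create("cloud");
console.log(model.availability(), session.backend); // "readily" "cloud"
```

Note that after either choice the site sees the same `"readily"` value, which is the missing privacy axis raised in the next paragraph.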
There's no axis there to express the privacy difference: should availability() instead return a dictionary with an extra field to explain where the data might implicitly go?
For a site that prefers local processing, it should be able to express that to create(). Web Speech currently proposes a mode option that takes "ondevice-preferred", "ondevice-only", or "cloud-only", but I think "cloud-only" is a mistake and that this could be a boolean.
This doesn't affect the API shape much, but it's interesting that the cloud option is significantly less fingerprintable than the local version, both with respect to the 1 bit for each model downloaded and the several bits potentially provided by the version of each cached model.
Good stuff; converging these APIs and providing better control over cloud vs. on-device is on our roadmap. (Although the latter is less urgent since initial implementations are all on-device.)
> There's no axis there to express the privacy difference: should availability() instead return a dictionary with an extra field to explain where the data might implicitly go?
This seems like a better fit for a separate sibling method or property, to me. availability() is envisioned to be used by website developers who might not want to incur downloads, or might want to provide differentiated user experiences depending on whether a download is required or not.
The question of where the data might go is somewhat orthogonal to that, more about privacy than about latency and user experience.
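The two shapes under discussion can be compared side by side. This is a rough sketch only: the `mayLeaveDevice` name, the `status` field, and the `modelInfo` object are hypothetical placeholders, not proposed API.

```javascript
// Hypothetical description of a model whose data may be sent to the cloud.
const modelInfo = { status: "after-download", cloudFallback: true };

// Shape 1 (the question): availability() returns a dictionary that
// bundles the download status with a privacy-related field.
function availabilityAsDictionary(info) {
  return { status: info.status, mayLeaveDevice: info.cloudFallback };
}

// Shape 2 (the reply): availability() keeps returning a plain string,
// and a sibling method answers the privacy question separately.
const siblingShape = {
  availability: (info) => info.status,
  mayLeaveDevice: (info) => info.cloudFallback,
};

console.log(availabilityAsDictionary(modelInfo));
console.log(siblingShape.availability(modelInfo), siblingShape.mayLeaveDevice(modelInfo));
```

With the second shape, code that only cares about downloads keeps comparing a string, and only privacy-sensitive sites need to call the extra method.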
> For a site that prefers local processing, it should be able to express that to create().
Thinking about this a bit more, I think the correct pattern is:
ai.whatever.create({mode: "on-device-only"}); // fails if only cloud is supported
ai.whatever.availability({mode: "on-device-only"}); // returns one of the existing values
ai.whatever.availability({mode: "cloud-only"}); // can only return "available" or "unavailable"
(modulo some bikeshedding around the mode name, the existence of "cloud-only" vs. a boolean, etc.)
This maintains the symmetry between create() and availability() options we've had so far, and seems like it lets you get all the information you'd need.
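As an illustration of that symmetry, here is a small mock in which both methods accept the same `mode` option. Everything here is an assumption made for the sketch: `MockWhatever`, its constructor fields, and the synchronous `create()` are hypothetical, and the mode names are still subject to the bikeshedding mentioned above.

```javascript
// Hypothetical mock of the option symmetry; all names are placeholders.
class MockWhatever {
  // onDeviceState: whatever availability() reports today for the local model.
  // cloudSupported: whether the browser offers a cloud backend at all.
  constructor({ onDeviceState, cloudSupported }) {
    this.onDeviceState = onDeviceState;
    this.cloudSupported = cloudSupported;
  }

  availability({ mode }) {
    if (mode === "cloud-only") {
      // Nothing to download, so only two answers are possible.
      return this.cloudSupported ? "available" : "unavailable";
    }
    if (mode === "on-device-only") {
      return this.onDeviceState;
    }
    throw new TypeError(`unknown mode: ${mode}`);
  }

  // Takes the same options as availability() and fails when that
  // mode cannot be satisfied.
  create({ mode }) {
    if (this.availability({ mode }) === "unavailable") {
      throw new Error(`cannot create a session with mode "${mode}"`);
    }
    return { mode };
  }
}

// A browser that only supports cloud processing.
const cloudOnlyBrowser = new MockWhatever({
  onDeviceState: "unavailable",
  cloudSupported: true,
});
console.log(cloudOnlyBrowser.availability({ mode: "cloud-only" }));
try {
  cloudOnlyBrowser.create({ mode: "on-device-only" });
} catch (err) {
  console.log(err.message);
}
```

A site with strict privacy requirements would call `availability({mode: "on-device-only"})` first and fall back to its own UI when the result is `"unavailable"`, rather than discovering the failure at `create()` time.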
(Related to #29.)