Loading WasmMemory into module #152
Comments
One more experiment:
- it seems the initial buffer ( |
In your first example, you create two VMs. Each VM instance is independent and has its own environment, even within the same WebAssembly memory. If you run the same logic inside one WebAssembly module, you will get the same result. You need to re-use the memory address of vm1 in the cloned WebAssembly module so that you are using vm1, instead of creating vm2.
If I understand correctly, I think this would be expected because these two objects are the same object. I think it works similarly to this:
const mem1 = []
const qjs1 = { mem: mem1 }
qjs1.mem.push('vm1')
_.isEqual(mem1, qjs1.mem) // true
mem1 === qjs1.mem // also true |
You can get the memory address of various objects like QuickJSContext, QuickJSRuntime by inspecting their private internals looking for pointer types. You’ll need to use the private constructor to create clones of the objects with the same address but referencing the new WebAssembly module via existing private APIs. |
Yeah, I've just realized that I should copy the original memory's buffer and compare it with a buffer from the |
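The difference between holding a reference to a memory's buffer and holding a copy of it can be illustrated without QuickJS at all. The following is a minimal sketch using only the standard `WebAssembly.Memory` API; `buffersEqual` is a hypothetical helper written for this illustration, and the single-byte write merely stands in for a VM mutating its heap.

```javascript
// Hypothetical helper: compare two ArrayBuffers byte by byte.
function buffersEqual(a, b) {
  const viewA = new Uint8Array(a);
  const viewB = new Uint8Array(b);
  if (viewA.length !== viewB.length) return false;
  return viewA.every((byte, i) => byte === viewB[i]);
}

const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

// slice(0) copies the bytes; the copy no longer tracks later writes.
const snapshot = memory.buffer.slice(0);

// This is a reference to the live buffer, not a copy.
const live = memory.buffer;

// Stand-in for the VM mutating its heap.
new Uint8Array(memory.buffer)[0] = 42;

console.log(buffersEqual(live, memory.buffer));     // true: same underlying buffer
console.log(buffersEqual(snapshot, memory.buffer)); // false: the copy is frozen in time
```

So a comparison is only meaningful when one side is a copy taken with `slice(0)` (or similar) before the writes happen.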
hmm...this looks a bit complicated to me...any help here would be appreciated, though I obviously understand that you have better things to do ;) |
ok, here's what I've got so far
with the output being:
Not sure if I can create a completely new runtime and attach a context that points to the memory of the old context - or maybe the runtime should be also pointing to the memory of the old runtime? |
Another try - by passing pointer to the context in the
- but with a similar result:
|
One last try - with creating both runtime and context from pointers:
with output:
Any suggestions would be great :) I've also run the code with the debug variant, i.e. - it produces this output:
|
Once you do
you can use |
Yup, that's why I'm disposing vm1 after storing the pointer values:
- but I'll try to remove all
What I've managed to get working so far is a second vm created from the pointer of the first vm, but within the same runtime and module:
I'll keep on experimenting, thanks for your suggestions! |
Ok, I think I've managed to make it work, at least on the most simple, PoC-level example. Pasting here in case someone needs to do something similar.
// Script 1: evaluates some code in vm1, then saves its pointers and the Wasm memory to disk
import { newQuickJSWASMModule, newVariant } from "quickjs-emscripten";
import releaseSyncVariant from "@jitl/quickjs-singlefile-mjs-release-sync";
import fs from "fs";

async function main() {
  // module 1
  const mem1 = new WebAssembly.Memory({
    initial: 256, //*65536
    maximum: 2048 //*65536
  });
  const variant1 = newVariant(releaseSyncVariant, {
    wasmMemory: mem1
  });
  const QuickJS1 = await newQuickJSWASMModule(variant1);

  // runtime 1
  const runtime1 = QuickJS1.newRuntime();

  // vm1
  const vm1 = runtime1.newContext();
  const res1 = vm1.evalCode(`let x = 100;
    function add() {
      x += 50;
      return x;
    };
  `);
  res1.value.dispose();
  const testRes = vm1.unwrapResult(vm1.evalCode(`add()`));
  console.log("add result (should be 150):", vm1.getNumber(testRes));
  testRes.dispose();

  // storing vm1 and runtime 1 pointers
  const vm1Ptr = vm1.ctx.value;
  const rt1Ptr = vm1.rt.value;
  console.log({ vm1Ptr, rt1Ptr });
  fs.writeFileSync("ptrs.json", JSON.stringify({ vm1Ptr, rt1Ptr }));

  // storing module 1 memory in file
  const buffer = QuickJS1.getWasmMemory().buffer;
  fs.writeFileSync("wasmMem.dat", new Uint8Array(buffer));

  // it is now safe to dispose vm1 and runtime1
  vm1.dispose();
  runtime1.dispose();
}

main().catch(e => console.error(e)).finally();
// Script 2: restores the state saved by script 1 into a new module and continues from there
import {
  Lifetime,
  newQuickJSWASMModule,
  newVariant,
  QuickJSRuntime,
  RELEASE_SYNC
} from "quickjs-emscripten";
import debugSyncVariant from "@jitl/quickjs-singlefile-mjs-debug-sync";
import releaseSyncVariant from "@jitl/quickjs-singlefile-mjs-release-sync";
import fs from "fs";

async function main() {
  // reading memory from file, creating new Memory instance
  // and copying contents of the first module's memory into it
  const memoryBuffer = fs.readFileSync("wasmMem.dat");
  const existingBufferView = new Uint8Array(memoryBuffer);
  const pageSize = 64 * 1024;
  const numPages = Math.ceil(memoryBuffer.byteLength / pageSize);
  const newWasmMemory = new WebAssembly.Memory({
    initial: numPages,
    maximum: 2048
  });
  const newWasmMemoryView = new Uint8Array(newWasmMemory.buffer);
  newWasmMemoryView.set(existingBufferView);

  // module 2
  const variant2 = newVariant(releaseSyncVariant, {
    wasmMemory: newWasmMemory
  });
  const { rt1Ptr, vm1Ptr } = JSON.parse(fs.readFileSync("ptrs.json", "utf-8"));
  const QuickJS2 = await newQuickJSWASMModule(variant2);

  // creating runtime from rt1Ptr pointer
  const rt = new Lifetime(rt1Ptr, undefined, (rt_ptr) => {
    QuickJS2.callbacks.deleteRuntime(rt_ptr)
    QuickJS2.ffi.QTS_FreeRuntime(rt_ptr)
  })
  const runtime2 = new QuickJSRuntime({
    module: QuickJS2.module,
    callbacks: QuickJS2.callbacks,
    ffi: QuickJS2.ffi,
    rt
  });

  // creating context from vm1 ptr
  const vm2 = runtime2.newContext({
    contextPointer: vm1Ptr
  });
  const testRes2 = vm2.unwrapResult(vm2.evalCode(`add()`));
  console.log("add result 2 (should be 200):", vm2.getNumber(testRes2));
  testRes2.dispose();
  vm2.dispose();
  runtime2.dispose();
}

main().catch(e => console.error(e)).finally();
Few notes:
Thanks @justjake for your help and all your suggestions!
EDIT - one last note - the issue with wasm memory dump size can be easily 'fixed' with compression:
// saving side (script 1): gzip the memory dump before writing it to disk
const buffer = QuickJS1.getWasmMemory().buffer;
const compressionStream = new CompressionStream('gzip');
const uint8Buffer = new Uint8Array(buffer);
const stream = new ReadableStream({
  start(controller) {
    controller.enqueue(uint8Buffer);
    controller.close();
  },
});
const compressedStream = stream.pipeThrough(compressionStream);
const compressedBuffer = await new Response(compressedStream).arrayBuffer();
fs.writeFileSync("wasmMem.dat", new Uint8Array(compressedBuffer));

// restoring side (script 2): gunzip the dump before copying it into the new memory
const compressedBuffer = fs.readFileSync("wasmMem.dat");
const compressedBufferView = new Uint8Array(compressedBuffer);
const decompressionStream = new DecompressionStream('gzip');
const compressedStream = new ReadableStream({
  start(controller) {
    controller.enqueue(compressedBufferView);
    controller.close();
  },
});
const decompressedStream = compressedStream.pipeThrough(decompressionStream);
const decompressedBuffer = await new Response(decompressedStream).arrayBuffer();
const memoryBuffer = new Uint8Array(decompressedBuffer);
const pageSize = 64 * 1024;
const numPages = Math.ceil(memoryBuffer.byteLength / pageSize);
const newWasmMemory = new WebAssembly.Memory({
  initial: numPages,
  maximum: 2048
});
const newWasmMemoryView = new Uint8Array(newWasmMemory.buffer);
newWasmMemoryView.set(memoryBuffer);

// module 2
const variant2 = newVariant(releaseSyncVariant, {
  wasmMemory: newWasmMemory
});

In the above example, it reduces the memory dump size from ~16 MB to ~87 KB. |
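The compression step can also be checked in isolation. Below is a minimal sketch of a gzip round trip over a mostly empty `WebAssembly.Memory`, using the standard Compression Streams API; `pipeBytes`, `gzip`, `gunzip` and `roundTrip` are hypothetical helper names, and the 16-page memory with four written bytes is only a stand-in for a real QuickJS heap, so the resulting sizes will differ from the figures quoted above.

```javascript
// Hypothetical helper: push bytes through a (de)compression transform stream.
async function pipeBytes(bytes, transform) {
  const stream = new Blob([bytes]).stream().pipeThrough(transform);
  return new Uint8Array(await new Response(stream).arrayBuffer());
}
const gzip = (bytes) => pipeBytes(bytes, new CompressionStream("gzip"));
const gunzip = (bytes) => pipeBytes(bytes, new DecompressionStream("gzip"));

async function roundTrip() {
  // 16 pages = 1 MiB, zero-filled; stand-in for a module's memory.
  const memory = new WebAssembly.Memory({ initial: 16 });
  const heap = new Uint8Array(memory.buffer);
  heap.set([1, 2, 3, 4], 1024); // pretend a VM wrote a little state

  const compressed = await gzip(heap);
  const restored = await gunzip(compressed);
  return {
    originalBytes: heap.byteLength,
    compressedBytes: compressed.byteLength,
    // The decompressed dump must match the original byte for byte.
    identical:
      restored.length === heap.length &&
      restored.every((byte, i) => byte === heap[i]),
  };
}

roundTrip().then(console.log);
```

A freshly initialized heap is mostly zeros, which is presumably why gzip shrinks the dump so dramatically; a heavily used heap would likely compress less well.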
@ppedziwiatr & @justjake Thanks for the work here, this is really impressive. I'm considering using this approach because we need to polyfill QuickJS with localization capabilities, but the polyfilling process is very slow, too slow to run per script evaluation. So the use case is to restore a version of QuickJS which has the required polyfills in place and use it to execute scripts for different end users in our system. We apply the polyfills, then save the memory per your example; on the next evaluation we restore QuickJS per your example and have a polyfilled QuickJS ready to use. It seems to work really well. 👏 I'm wondering about the isolation properties of this approach. Would it be safe to share this memory-restored instance between users of our system? (We want to ensure that two users are running in isolated scripting environments.) |
IMO - if you use the polyfilled version as a "starting point" and copy it for each user (so that each user is operating on their own copy) - then it should be safe? Btw. (as a warning) - the above approach does not work very well if you need to bind any functions from the host - it is impossible to 'restore' such bindings, and binding each time from scratch was quickly causing some memory/pointer issues (can't remember exactly...). That's why in the end we dropped this solution. |
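The "copy it for each user" idea can be sketched at the memory level as follows. This is only an illustration with the standard `WebAssembly.Memory` API: `memoryFromSnapshot` is a hypothetical helper, `baseSnapshot` is a stand-in for a dump of the polyfilled module's memory, and in a real setup each resulting memory would presumably be passed as `wasmMemory` to `newVariant`, as in the scripts earlier in this thread. It says nothing about host bindings or any other state kept outside the Wasm memory.

```javascript
const PAGE_SIZE = 64 * 1024; // WebAssembly page size in bytes

// Hypothetical helper: build a fresh WebAssembly.Memory pre-filled with the snapshot bytes.
function memoryFromSnapshot(snapshotBytes, maximumPages = 2048) {
  const initial = Math.ceil(snapshotBytes.byteLength / PAGE_SIZE);
  const memory = new WebAssembly.Memory({ initial, maximum: maximumPages });
  new Uint8Array(memory.buffer).set(snapshotBytes);
  return memory;
}

// Stand-in for the saved dump of the polyfilled module's memory.
const baseSnapshot = new Uint8Array(PAGE_SIZE);
baseSnapshot[0] = 7;

// Each user gets an independent copy of the starting point.
const userA = memoryFromSnapshot(baseSnapshot);
const userB = memoryFromSnapshot(baseSnapshot);

// A write in one user's memory is invisible to the other user and to the base.
new Uint8Array(userA.buffer)[0] = 99;

console.log(new Uint8Array(userA.buffer)[0]); // 99
console.log(new Uint8Array(userB.buffer)[0]); // 7
console.log(baseSnapshot[0]);                 // 7
```

Because every restore copies the bytes into a new memory, the cost per user is one allocation plus one `set()` of the snapshot size.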
@justjake I'm keen to use this approach as it solves the slow polyfill of Intl packages, but am worried as it relies on protected properties. Would love to know your take on this approach, and if you'd consider exposing these properties. I guess I'm really after a way to clone a context 🤔 |
So just to confirm (apologies, I'm inexperienced with this emscripten stuff): Is the summary of this thread that it's not really possible to create a "snapshot" of a module (or runtime, or context) which can e.g. be saved to disk, and later loaded? I for some reason thought this was one of the key use cases of QuickJS, and had been designing some features of a game engine based around this. I tried code based on ppedziwiatr's example, but got an error about an invalid function reference when trying
ChatGPT is telling me that I'd also need to somehow copy:
And so it basically says that it's not feasible - does anyone know if that is roughly correct? |
Okay so it turns out that the QuickJS package/module or some part of it (e.g. emscripten) is keeping some 'global' state which changes the internal init of memory/pointers/something for each successive QuickJS module/runtime you create. So to avoid this, you can instantiate each QuickJS module/runtime in its own Web Worker, and then snapshotting works as expected. The disadvantage of this is that it gives the QuickJS instance an async API. Unless @justjake has a solution here, then if you need a sync API you might have to wait for ShadowRealm, which would prevent emscripten/QuickJS from being "stateful", in the sense that it currently seems to be.
Working Snapshot/Restore Example:
You can paste this in your browser console to test (uses
async function createRuntime({ code, snapshot, maxMemoryBytes=1024*1024*1024 }) {
  if (!code && !snapshot) throw new Error('Either code or snapshot must be provided');
  const workerCode = `
    import { newQuickJSWASMModuleFromVariant, setDebugMode, newVariant, Lifetime, QuickJSRuntime } from "https://esm.sh/[email protected]";
    import releaseVariant from "https://esm.sh/@jitl/[email protected]";
    // setDebugMode(true);
    self.onmessage = async ({ data: { type, code, snapshot } }) => {
      try {
        // Determine upfront how much initial memory we'll need, in case we're loading from snapshot:
        let initialMemoryPages = 256;
        let snapshotMemoryBuffer;
        if(snapshot) {
          if(snapshot.compressed) {
            snapshotMemoryBuffer = await new Response(new Blob([snapshot.memoryBuffer]).stream().pipeThrough(new DecompressionStream("gzip"))).arrayBuffer();
          } else {
            snapshotMemoryBuffer = snapshot.memoryBuffer;
          }
          initialMemoryPages = Math.max(initialMemoryPages, snapshotMemoryBuffer.byteLength / (64*1024));
        }
        // Create the QuickJS module:
        let wasmMemory = new WebAssembly.Memory({ initial:initialMemoryPages, maximum:${Math.ceil(maxMemoryBytes/(64*1024))} });
        const variant = newVariant(releaseVariant, {wasmMemory});
        const QuickJS = await newQuickJSWASMModuleFromVariant(variant);
        // Create the vm:
        let runtime, vm;
        if (snapshot) {
          const newMemView = new Uint8Array(wasmMemory.buffer);
          const oldMemView = new Uint8Array(snapshotMemoryBuffer);
          newMemView.set(oldMemView);
          // Restore runtime from pointer (not sure if the pointer stuff is even needed?)
          const rt = new Lifetime(snapshot.runtimePointer, undefined, (rt_ptr) => {
            QuickJS.callbacks.deleteRuntime(rt_ptr);
            QuickJS.ffi.QTS_FreeRuntime(rt_ptr);
          });
          runtime = new QuickJSRuntime({
            module: QuickJS.module,
            callbacks: QuickJS.callbacks,
            ffi: QuickJS.ffi,
            rt
          });
          // Restore context from pointer
          vm = runtime.newContext({
            contextPointer: snapshot.contextPointer,
          });
        } else {
          // Create fresh runtime and context
          runtime = QuickJS.newRuntime();
          vm = runtime.newContext();
          // Initialize with code
          const result = vm.evalCode(\`globalThis.self=globalThis;\${code}\`);
          if (result.error) {
            let error = vm.dump(result.error);
            result.error.dispose();
            if(!error) error = {type:"UnknownError", message:"Possibly memory limit was reached?"};
            throw new Error(\`Failed to initialize VM: \${error.type} \${error.message}\`);
          }
          result.value.dispose();
        }
        // Send back initial state and handle future commands
        self.onmessage = async ({ data: { requestId, cmd, args } }) => {
          try {
            switch (cmd) {
              case 'eval': {
                const result = vm.evalCode(args.code);
                if (result.error) {
                  let error = vm.dump(result.error);
                  result.error.dispose();
                  if(!error) error = {type:"UnknownError", message:"Possibly memory limit was reached?"};
                  throw new Error(\`Evaluation failed: \${error.type} \${error.message}\`);
                }
                const value = vm.dump(result.value);
                result.value.dispose();
                self.postMessage({ requestId, type:'response', value });
                break;
              }
              case 'snapshot': {
                let memoryBuffer;
                if(args.compressed) {
                  memoryBuffer = await new Response(new Blob([wasmMemory.buffer]).stream().pipeThrough(new CompressionStream('gzip'))).arrayBuffer();
                } else {
                  memoryBuffer = wasmMemory.buffer.slice(0);
                }
                const snapshot = {
                  memoryBuffer,
                  compressed: !!args.compressed,
                  runtimePointer: vm.rt.value,
                  contextPointer: vm.ctx.value
                };
                self.postMessage({ requestId, type:'response', value:snapshot }, [snapshot.memoryBuffer]);
                break;
              }
              case 'dispose': {
                try {
                  vm.dispose();
                  runtime.dispose();
                } catch(e) {
                  console.log("Error while disposing vm/runtime:", e);
                }
                self.close();
                break;
              }
            }
          } catch (error) {
            self.postMessage({ requestId, type:'response', value:error.message, isError:true });
          }
        };
        self.postMessage({ type: 'vm_ready' });
      } catch (error) {
        self.postMessage({ type:'error', error:error.message });
        self.close();
      }
    };
  `;
  // Create worker from blob URL
  const blob = new Blob([workerCode], { type: 'text/javascript' });
  const workerUrl = URL.createObjectURL(blob);
  const worker = new Worker(workerUrl, { type: 'module' });
  URL.revokeObjectURL(workerUrl);
  // Init vm:
  await new Promise((resolve, reject) => {
    worker.onmessage = ({ data }) => {
      if (data.type === 'vm_ready') resolve();
      else if (data.type === 'error') reject(new Error(data.error));
    };
    worker.postMessage({ type: 'init_vm', code, snapshot });
  });
  let disposed = false;
  // Detect undisposed runtime. For some reason this needs to be global (I guess it gets GC'ed before `api` does - and nope... it seems it's not possible for a finalization registry to register itself).
  let finalizationRegistryId = Math.random().toString()+Math.random().toString();
  window[`__runtimeWorkerRegistry__${finalizationRegistryId}`] = new FinalizationRegistry((worker) => {
    if(!disposed) {
      console.warn(`createRuntime: Cleaning up undisposed runtime. For performance/efficiency, you should call runtime.dispose() after you've finished using it.`);
      worker.terminate();
    }
    delete window[`__runtimeWorkerRegistry__${finalizationRegistryId}`];
  });
  let messageIdToResolvers = {};
  worker.addEventListener("message", ({ data }) => {
    if (data.type === 'response') {
      messageIdToResolvers[data.requestId][data.isError ? "reject" : "resolve"](data.isError ? new Error(data.value) : data.value);
      delete messageIdToResolvers[data.requestId];
    } else {
      throw new Error(`Unknown worker message type: ${data.type}`);
    }
  });
  let api = {
    eval: (code) => {
      if(disposed) throw new Error('Runtime has been disposed');
      let requestId = Math.random().toString()+Math.random().toString();
      worker.postMessage({ cmd: 'eval', args:{code}, requestId:requestId });
      return new Promise((resolve, reject) => messageIdToResolvers[requestId]={resolve, reject});
    },
    getMemorySnapshot: ({compressed=false}={}) => {
      if(disposed) throw new Error('Runtime has been disposed');
      let requestId = Math.random().toString()+Math.random().toString();
      worker.postMessage({ cmd: 'snapshot', args:{compressed}, requestId:requestId });
      return new Promise((resolve, reject) => messageIdToResolvers[requestId]={resolve, reject});
    },
    dispose: () => {
      disposed = true;
      worker.postMessage({ cmd: 'dispose' });
    },
  };
  window[`__runtimeWorkerRegistry__${finalizationRegistryId}`].register(api, worker);
  return api;
}
////////////////////////////////////////////
// EXAMPLE USAGE //
////////////////////////////////////////////
const code = `
  let foo = 1;
  let obj = {a:1};
  function doStuff(n) {
    foo += n;
    obj.a++;
    obj.fn = () => 123;
    return foo > 4 ? obj : foo;
  }
`;
// Create initial runtime:
const runtime = await createRuntime({ code });
console.log(await runtime.eval("foo")); // 1
console.log("doStuff(3)", await runtime.eval("doStuff(3)"));
console.log(await runtime.eval("foo")); // 4
// Snapshot and delete initial runtime:
const snapshot = await runtime.getMemorySnapshot(); // pass {compressed:true} for compressed snapshot
runtime.dispose();
// Restore runtime and continue:
const restoredRuntime = await createRuntime({ snapshot });
console.log(await restoredRuntime.eval("foo")); // Should show 4
console.log("doStuff(2)", await restoredRuntime.eval("doStuff(2)"));
console.log(await restoredRuntime.eval("foo")); // Should show 6
console.log(await restoredRuntime.eval("obj.fn()")); // Should show 123
restoredRuntime.dispose(); |
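The async API in the example above rests on one pattern: every request carries a `requestId`, and the reply with the same id settles the matching promise. Here is a minimal sketch of just that pattern, with a `MessageChannel` standing in for the Web Worker so it runs without QuickJS; `createRpcClient`, `serve` and `demo` are hypothetical names, and a counter replaces the random ids purely for simplicity.

```javascript
// Hypothetical helper: wrap a MessagePort-like endpoint in a promise-based call() API.
function createRpcClient(port) {
  let nextId = 0;
  const pending = new Map(); // requestId -> { resolve, reject }
  port.onmessage = ({ data }) => {
    const entry = pending.get(data.requestId);
    if (!entry) return; // not a reply we are waiting for
    pending.delete(data.requestId);
    if (data.isError) entry.reject(new Error(data.value));
    else entry.resolve(data.value);
  };
  return {
    call(cmd, args) {
      const requestId = nextId++;
      return new Promise((resolve, reject) => {
        pending.set(requestId, { resolve, reject });
        port.postMessage({ requestId, cmd, args });
      });
    },
    pendingCount: () => pending.size,
  };
}

// Stand-in for the worker side: run the named handler and reply with the same requestId.
function serve(port, handlers) {
  port.onmessage = ({ data: { requestId, cmd, args } }) => {
    try {
      port.postMessage({ requestId, value: handlers[cmd](args) });
    } catch (error) {
      port.postMessage({ requestId, value: error.message, isError: true });
    }
  };
}

async function demo() {
  const { port1, port2 } = new MessageChannel();
  serve(port2, {
    add: ({ a, b }) => a + b,
    fail: () => { throw new Error("boom"); },
  });
  const client = createRpcClient(port1);
  const sum = await client.call("add", { a: 2, b: 3 });
  const failure = await client.call("fail", {}).catch((e) => e.message);
  port1.close(); // closing one port closes the channel
  return { sum, failure, pending: client.pendingCount() };
}

demo().then(console.log);
```

Keeping the pending requests in a `Map` and deleting each entry once it is settled means the bookkeeping does not grow over the lifetime of the worker.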
Interesting finding with the WebWorker to get a completely fresh copy of the JavaScript environment state. I think it should be possible to overcome the issues with Emscripten global state, but it will take some investigation of Emscripten’s JS wrapper. My recommendation to continue this work is:
I think if you can figure out how those things work mechanically at the Emscripten level, we can figure out a high-level interface for full serialization to disk that we can expose to quickjs-emscripten users, as well as a way to get the wrapper code changes to work with the build system somehow. |
Hey,
as a follow-up of #146 - here's code that I'm trying to run:
What it does - it simply creates one quickjs module, evals some code, stores the module's Wasm memory - and then a second module is created with a Wasm memory from the first one. I would expect that all the code evaluated in the first one will be available in the second - but that's not the case. The result is:
Is this expected behaviour? If so, is it possible to somehow save a state of a module and resume the evaluation later with this exact state?