Skip to content

Implement persistent store for documents #75

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 44 commits into
base: main
Choose a base branch
from

Conversation

jsparber
Copy link
Collaborator

@jsparber jsparber commented May 7, 2025

This adds storing document and authors presistently on disk.

I didn't do a lot of testing yet, but should be mostly fine to merge.

jsparber added 6 commits May 5, 2025 14:41
The aardvark-doc doesn't need/use tokio, since aardvark-node exposes a
async-std API.
This ensures that operations are blocked and stored till the node is
running. They will be executed in call order once the node is running.
In a future commit the DocumentStore will be written to an sqlite DB,
but the document_tx hashmap won't be persistent.
Logically it makes sense that we map documents to a set of authors
instead of the other way around. Additionally this makes the
implementation of TopicLogMap trait much simpler without the need to
look at authors that aren't authors of a specific document.
This probably doesn't fully unsubscribe since p2panda doesn't really
implement it, but at least we don't store any updates to the
OperationStore.
See for more details: p2panda/p2panda#639
@jsparber
Copy link
Collaborator Author

jsparber commented May 7, 2025

I'm seeing a couple of the following errors @adzialocha any idea:

2025-04-26T14:24:31.998642Z ERROR aardvark_node::network: ingesting operation failed: critical storage failure: an error occurred with the sqlite database: error returned from database: (code: 1555) UNIQUE constraint failed: operations_v1.hash

@adzialocha adzialocha self-requested a review May 7, 2025 15:19
jsparber added 21 commits May 8, 2025 10:58
This exposes the PublicKey and PrivateKey structs via boxed glib types
so they can be used as GObject properties.
This uses oo7 to store the private key ("identity") to the keyring.
This will be used to store timestamps to the sqlite DB.
Sqlx will be used for the persistent store of operations and other data.
This feature will be used in a future commit to make use of the sqlite
operation store.
This uses the sqlite implementation of p2panda's OperationStore instead
of the in-memory store.
Note that if no path for the DB is set a in-memory sqlite database is
created.
We don't use the default feature of the p2panda store anymore since we
now use the sqlite OperationStore instead of the memory OperationStore.
This allows combining the our migrations with the migrations from
p2panda operation store.
This allows the client code to decide whether to block on shutdown. This
also makes shutdown consistent with the run method.
This add types and implements traits needed to write and read documents
and authors to and from the sqlite DB.
The DocumentStore keeps persistently track of:
- Document ID
- Document authors with last seen
- Last access to a document
- Name of the document
This extracts the name from the first line of the document.
This property keeps track of the last time the document was subscribed
to. Subscription will be added in a future commit.
When the subscribe property is `true` the Document is kept in sync with
other peers. If it's `false` not changes of the Document will be written
to the network nor the local DB.
Since we now require documents to be subscribed to tests are failing.
This makes sure that we unsubscribe form a document when the window is
closed or a new document is opened.
The node does only store the OperationStore for now and doesn't reload
operations.
Since we require now a data dir for a Service we need to set it in the
tests as well.
This allows to load the authors from the DB provided by the node.
This allows setting the authors from the state stored in the local DB.
jsparber added 4 commits May 8, 2025 12:33
This adds methods to create a document from an already known document
loaded from the local DB.
This creates documents and authors for the existing state loaded from
the local DB.
This adds a document list popover to the open button.
This makes sure that we don't create refcycle and the window obj is
disposed.
@jsparber jsparber force-pushed the jsparber/operation_store branch from 33c8a91 to 795e3e0 Compare May 8, 2025 10:34
@jsparber
Copy link
Collaborator Author

jsparber commented May 8, 2025

I'm seeing a couple of the following errors @adzialocha any idea:

2025-04-26T14:24:31.998642Z ERROR aardvark_node::network: ingesting operation failed: critical storage failure: an error occurred with the sqlite database: error returned from database: (code: 1555) UNIQUE constraint failed: operations_v1.hash

I think i found out why I see this error. Loro creats probably twice the same snapshot so that we get two operations with the same hash. This also makes the documents get out of sync. I will look into it some more tomorrow.

@adzialocha
Copy link
Member

I'm seeing a couple of the following errors @adzialocha any idea:

2025-04-26T14:24:31.998642Z ERROR aardvark_node::network: ingesting operation failed: critical storage failure: an error occurred with the sqlite database: error returned from database: (code: 1555) UNIQUE constraint failed: operations_v1.hash

I think i found out why I see this error. Loro creates probably twice the same snapshot so that we get two operations with the same hash. This also makes the documents get out of sync. I will look into it some more tomorrow.

Yes, I agree! It's probably the store inserting the same operation multiple times. I couldn't find the issue yet after a first scan, but it definitely smells like something is done too many times.

We still want to gossip changes even if we can't store the snapshot.
There is definitely something wrong but at least this way the document
doesn't get out of sync.
@jsparber
Copy link
Collaborator Author

jsparber commented May 9, 2025

I'm seeing a couple of the following errors @adzialocha any idea:

2025-04-26T14:24:31.998642Z ERROR aardvark_node::network: ingesting operation failed: critical storage failure: an error occurred with the sqlite database: error returned from database: (code: 1555) UNIQUE constraint failed: operations_v1.hash

I think i found out why I see this error. Loro creates probably twice the same snapshot so that we get two operations with the same hash. This also makes the documents get out of sync. I will look into it some more tomorrow.

Yes, I agree! It's probably the store inserting the same operation multiple times. I couldn't find the issue yet after a first scan, but it definitely smells like something is done too many times.

Found a workaround for now that makes aardvark work, but definitely we do something wrong. 90ef375

Also we should send the snapshot on update, essentially we need to fix the TODO https://github.com/p2panda/aardvark/blob/main/aardvark-doc/src/document.rs#L225

jsparber added 12 commits May 18, 2025 13:19
`p2panda_stream::operation::ingest_operation()` does additional checks
we don't need since we can always trust our own operation sequence. This
speeds up the operation creation significantly.
We need to make sure that we create operations in sequence and not in
parallel.
This reduces the number of snapshots we created.
This fixes some issues when importing remote exports.
This ensures that all signals and changes to the LoroDocument are done
on the main thread.
We need signals to be emitted in the main context where the main loop is
running. The easiest way is to change the document only from the main
context.
This insures that the LoroDocument and the TextBuffer are never in
invalid state.
Since all changes that are received from the node are now invoked on the
main context we don't need to idle spawn changes to authors.
The gossip system events don't provide information about all authors
only about the closest few. This ensures that we add new authors, but
the connection state may be wrong.
@jsparber
Copy link
Collaborator Author

Pushed the changes i had in jsparber/less_snapshots to this MR. I think this is ready to be merged after a quick review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants