Collaborator

@TotallyNotChase TotallyNotChase commented Dec 7, 2025

Overview

Adds ChromaDB integration to the node, along with the corresponding Rholang primitives.

  • Scaffold service
  • Add service methods
    • Create or update collection
    • Get collection metadata
    • Add or update document
    • Query documents
  • Hook service methods to Rholang primitives (define them in system processes)
    • Create or update collection
    • Get collection metadata
    • Add or update document
    • Query documents
  • Enable the primitives (define them in rho runtime)
    • Create or update collection
    • Get collection metadata
    • Add or update document
    • Query documents
  • Add basic tests

Notes

This will require adding parsers for lists and maps to the rho types; those are currently missing. Some of the service's arguments need to be lists or maps, and so far no system process/service seems to require complex types.
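
As a rough illustration of the kind of list and map parsing described above, here is a minimal sketch. It assumes a simplified, hypothetical `RhoValue` enum rather than the node's actual `Par` type, and the function names `extract_string_list` and `extract_map` are made up for this example.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for a Rholang value (not the node's `Par`).
#[derive(Debug, Clone, PartialEq)]
pub enum RhoValue {
    Nil,
    Int(i64),
    Str(String),
    List(Vec<RhoValue>),
    Map(BTreeMap<String, RhoValue>),
}

/// Extracts a list of strings, failing if the value is not a list
/// or if any element is not a string.
pub fn extract_string_list(value: &RhoValue) -> Option<Vec<String>> {
    match value {
        RhoValue::List(items) => items
            .iter()
            .map(|item| match item {
                RhoValue::Str(s) => Some(s.clone()),
                _ => None,
            })
            .collect(),
        _ => None,
    }
}

/// Extracts a string-keyed map, converting each value with `extract_value`.
/// Fails if the value is not a map or if any entry fails to convert.
pub fn extract_map<T>(
    value: &RhoValue,
    extract_value: impl Fn(&RhoValue) -> Option<T>,
) -> Option<BTreeMap<String, T>> {
    match value {
        RhoValue::Map(entries) => entries
            .iter()
            .map(|(k, v)| extract_value(v).map(|t| (k.clone(), t)))
            .collect(),
        _ => None,
    }
}
```

Collecting into `Option<...>` makes the whole extraction fail as soon as one element has the wrong type, which seems like the safer default for service arguments.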

Please make sure that this PR:

Bors cheat-sheet:

  • bors r+ runs integration tests and merges the PR (if it's approved),
  • bors try runs integration tests for the PR,
  • bors delegate+ enables non-maintainer PR authors to run the above.

StringMeta(String),
NumberMeta(i64),
NullMeta,
// TODO (chase): Support floating point numbers once Rholang does?
Collaborator Author

Is this ever in the roadmap?

Collaborator

Do you mean in the roadmap for Rholang?

Collaborator Author

Yep!

Comment on lines +159 to +161
// TODO (chase): Do we need custom options? i.e custom database name, authentication method, and url?
// If the chroma db is hosted alongside the node locally, custom options don't make much sense.
let client = ChromaClient::new(ChromaClientOptions::default())
Collaborator Author

Would custom options for database name, authentication method and URL ever be required? They wouldn't make much sense if ChromaDB is hosted alongside the node in a tightly coupled manner.

Collaborator

Let's make these configuration options that can be put in shared-rnode-runtime.conf and shared-rnode.conf

Collaborator Author

AFAICT, the Rust implementation of Rholang doesn't have any hooks into the configuration files; the only configuration it seems to take comes from CLI options.

It might be better to wait until the config infrastructure is set up (assuming it's in the works) before adding these options. Otherwise, setting up that infrastructure ourselves will add some time on our end.
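
To make the discussion concrete, here is a minimal sketch of what configurable connection options with local defaults could look like once config plumbing exists. Everything here is hypothetical: the `ChromaSettings` struct, its field names, the `chroma.*` keys, and the default values are assumptions for illustration, not part of this PR or of the ChromaDB client library.

```rust
use std::collections::HashMap;

/// Hypothetical connection settings; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct ChromaSettings {
    pub url: String,
    pub database: String,
    pub auth_token: Option<String>,
}

impl Default for ChromaSettings {
    /// Defaults aimed at a ChromaDB instance running next to the node.
    /// The concrete values are assumptions for this sketch.
    fn default() -> Self {
        Self {
            url: "http://localhost:8000".to_string(),
            database: "default_database".to_string(),
            auth_token: None,
        }
    }
}

impl ChromaSettings {
    /// Starts from the defaults and overrides any key present in `overrides`
    /// (for example, values parsed from a config file or CLI flags).
    pub fn from_overrides(overrides: &HashMap<String, String>) -> Self {
        let defaults = Self::default();
        Self {
            url: overrides.get("chroma.url").cloned().unwrap_or(defaults.url),
            database: overrides
                .get("chroma.database")
                .cloned()
                .unwrap_or(defaults.database),
            auth_token: overrides
                .get("chroma.auth-token")
                .cloned()
                .or(defaults.auth_token),
        }
    }
}
```

With this shape, a node that hosts ChromaDB locally needs no configuration at all, while a deployment with a remote instance only overrides the keys it cares about.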

Comment on lines 265 to 270
// The embedding are currently auto-filled by a pre-chosen embedding function.
embeddings: None,
};

// We'll use OpenAI to generate embeddings.
let embeddingsf = OpenAIEmbeddings::new(Default::default());
Collaborator Author

See here: the embeddings are currently auto-generated rather than supplied by the user from Rholang code. This avoids requiring large amounts of input data in Rholang code, which would drive up script costs.

It currently chooses OpenAI to generate said embeddings. The choice is due to two reasons:

  • It's already supported by the ChromaDB library
  • OpenAI API is already used elsewhere in this codebase (the openai service)

Collaborator Author

If we need to support embeddings provided from Rholang code, we should figure out how to represent them. Embeddings are usually floating-point numbers, which Rholang doesn't support. It's possible to use integers as limited-precision fractional numbers, though that's not particularly clean.
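
For reference, the integer workaround mentioned above could look roughly like this fixed-point sketch. The scale factor and the function names are arbitrary choices for illustration and are not part of this PR.

```rust
/// Scale factor: 6 decimal digits of precision (an arbitrary illustrative choice).
pub const SCALE: i64 = 1_000_000;

/// Encodes a float as a scaled integer, rounding to the nearest step.
pub fn encode_fixed(x: f32) -> i64 {
    (f64::from(x) * SCALE as f64).round() as i64
}

/// Decodes a scaled integer back to a float (precision beyond SCALE is lost).
pub fn decode_fixed(n: i64) -> f32 {
    (n as f64 / SCALE as f64) as f32
}

/// Encodes a whole embedding vector component by component.
pub fn encode_embedding(values: &[f32]) -> Vec<i64> {
    values.iter().copied().map(encode_fixed).collect()
}

/// Decodes a whole embedding vector component by component.
pub fn decode_embedding(values: &[i64]) -> Vec<f32> {
    values.iter().copied().map(decode_fixed).collect()
}
```

The round trip is lossy by up to half a step per component, and every embedding becomes a long list of integers in the Rholang source, which is the input-size cost raised earlier in this thread.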

Collaborator

I don't think it would be good to support embeddings in Rholang, but I do think we should support BERT embeddings as well (see the original task description for details). We can let users choose which embedding source to use in the configuration files.
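
One possible shape for a configurable embedding source is sketched below. The `EmbeddingSource` enum and the accepted strings are hypothetical, chosen only to illustrate the idea of selecting the source from a config value; they are not an existing API in the node or in the ChromaDB library.

```rust
use std::str::FromStr;

/// Hypothetical set of embedding backends selectable from configuration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EmbeddingSource {
    OpenAi,
    Bert,
}

impl FromStr for EmbeddingSource {
    type Err = String;

    /// Parses a config value case-insensitively, ignoring surrounding whitespace.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "openai" => Ok(Self::OpenAi),
            "bert" => Ok(Self::Bert),
            other => Err(format!("unknown embedding source: {other}")),
        }
    }
}
```

Rejecting unknown values at startup, rather than silently falling back to OpenAI, would surface typos in the configuration early.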

Comment on lines +824 to +826
// TODO (chase): How to define overloads?
// This function can support 4 or 3 arguments (including ack) (second to last one is optional).
arity: 4,
Collaborator Author

Rholang seems to support overloads for user-defined "contracts", but I don't see an (easy) way to support overloads for service methods. It'd be nice to support overloads for create_collection because its last parameter (before the ack channel) is optional.

Not a high priority though, since the implementation treats both RNil and {} (empty map literal) as "no metadata argument provided".
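
The fallback described above can be sketched as follows. It uses a simplified, hypothetical `Arg` enum instead of the node's real argument types, and `parse_optional_metadata` is an illustrative name, not the function in this PR.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified argument value (not the node's `Par`).
#[derive(Debug, Clone, PartialEq)]
pub enum Arg {
    Nil,
    Str(String),
    Map(BTreeMap<String, String>),
}

/// Interprets the metadata argument of a fixed-arity call:
/// both `Nil` and an empty map mean "no metadata provided".
pub fn parse_optional_metadata(
    arg: &Arg,
) -> Result<Option<BTreeMap<String, String>>, String> {
    match arg {
        Arg::Nil => Ok(None),
        Arg::Map(m) if m.is_empty() => Ok(None),
        Arg::Map(m) => Ok(Some(m.clone())),
        other => Err(format!("expected a map or Nil for metadata, got {other:?}")),
    }
}
```

This keeps the arity fixed at 4 while still letting callers omit metadata, at the cost of an explicit placeholder argument in every call.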

}
}

pub struct RhoList;
Collaborator Author

I'm not sure why these didn't exist already, but they're useful to have.

}

- pub trait Extractor<RhoType> {
+ pub trait Extractor {
Collaborator Author

This type parameter was entirely redundant. It might have been relevant for Scala, but it's not for Rust.

Collaborator

I'm suspicious of this... Mind asking about this in the mlabs-sms channel in that Discord group I added you to?

Collaborator

Do the tests still run?
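
For readers following this thread, here is one way a trait like this can work in Rust without a type parameter, using an associated type. This is only a sketch under assumptions: the actual trait body is not shown in this conversation, and `RhoValue`, `RustType`, `unapply`, `RhoNumber` and `RhoString` are hypothetical names.

```rust
/// Hypothetical, simplified value type standing in for `Par`.
#[derive(Debug, Clone, PartialEq)]
pub enum RhoValue {
    Int(i64),
    Str(String),
}

/// Each marker type fixes its own output type through an associated type,
/// so the trait itself needs no extra type parameter.
pub trait Extractor {
    type RustType;

    /// Returns the extracted Rust value, or `None` on a type mismatch.
    fn unapply(value: &RhoValue) -> Option<Self::RustType>;
}

/// Marker type for integer extraction.
pub struct RhoNumber;
/// Marker type for string extraction.
pub struct RhoString;

impl Extractor for RhoNumber {
    type RustType = i64;

    fn unapply(value: &RhoValue) -> Option<i64> {
        match value {
            RhoValue::Int(n) => Some(*n),
            _ => None,
        }
    }
}

impl Extractor for RhoString {
    type RustType = String;

    fn unapply(value: &RhoValue) -> Option<String> {
        match value {
            RhoValue::Str(s) => Some(s.clone()),
            _ => None,
        }
    }
}
```

Whether the removed parameter was really redundant depends on how the original trait was used, so this does not settle the question raised above.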

Comment on lines +181 to +192
// these bytes may need to change during finalization.
pub fn chroma_create_collection() -> Par {
byte_name(25)
}

pub fn chroma_get_collection_meta() -> Par {
byte_name(26)
}

pub fn chroma_upsert_entries() -> Par {
byte_name(27)
}
Collaborator Author

These will have to change after the Nunet implementation has been merged, to prevent overlapping byte names.

With that said, the Nunet implementation uses odd byte names: they are not contiguous. See: https://github.com/F1R3FLY-io/f1r3node/pull/240/files#diff-0ff9b3c4f760d8d6f04ee13fd2d33f196b7ac795019a91b5bc8202dce9626ecbR162
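
A small check along these lines could catch overlapping bytes automatically once both sets of names are in the tree. It is a sketch only: `find_overlaps` is a hypothetical helper, and it works on plain `(name, byte)` pairs rather than on the `Par` values returned by `byte_name`.

```rust
use std::collections::HashMap;

/// Returns every byte claimed by more than one name, together with the
/// names that claim it, sorted by byte value.
pub fn find_overlaps<'a>(names: &[(&'a str, u8)]) -> Vec<(u8, Vec<&'a str>)> {
    let mut by_byte: HashMap<u8, Vec<&'a str>> = HashMap::new();
    for &(name, byte) in names {
        by_byte.entry(byte).or_default().push(name);
    }
    let mut overlaps: Vec<(u8, Vec<&'a str>)> = by_byte
        .into_iter()
        .filter(|(_, claimed_by)| claimed_by.len() > 1)
        .collect();
    overlaps.sort_by_key(|(byte, _)| *byte);
    overlaps
}
```

Run as a unit test over the full table of system process names, an empty result would confirm that no two names share a byte, regardless of whether the assigned bytes are contiguous.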

@TotallyNotChase TotallyNotChase marked this pull request as ready for review December 17, 2025 05:22
@TotallyNotChase
Collaborator Author

I have added a bunch of runnable examples under rholang/examples/system-contract/chroma-db/. They have to be run one by one, in order, starting at 02 (01 operates on a different collection altogether).

Ensure that a ChromaDB instance is running locally in the background and that the OPENAI_API_KEY env var is set to a valid key. Run each script from within the rholang directory, e.g. cargo run --bin rholang-cli -- ./examples/system-contract/chroma-db/02-create-collection-meta.rho.
