Refactor C2D docs. #1513

Open · wants to merge 11 commits into main
Conversation

@mariacarmina mariacarmina commented May 12, 2025

Fixes #1515 .

Changes proposed in this PR:

  • add sequence flow diagrams for free and paid compute
  • explain the entire free and paid flow diagrams
  • refactor the existing C2D documentation
  • create documentation for the escrow contract
  • update the C2D code example in ocean.js

@mariacarmina mariacarmina self-assigned this May 12, 2025
@mariacarmina mariacarmina marked this pull request as ready for review May 22, 2025 20:28
@mariacarmina mariacarmina changed the title C2D refactor for docs. Refactor C2D docs. May 22, 2025

@giurgiur99 giurgiur99 left a comment


Some small changes 🙏


Can we update with the latest screenshot, without paid resources? Thanks.

In the C2D workflow, the following steps are performed:

1. The consumer initiates a compute-to-data job by selecting the desired data asset and algorithm; the orders are then validated via the dApp used.
2. A dedicated and isolated execution pod is created for the C2D job.


Container, not pod, for now

3. The execution pod loads the specified algorithm into its environment.
4. The execution pod securely loads the selected dataset for processing.
5. The algorithm is executed on the loaded dataset within the isolated execution pod.
6. The results and logs generated by the algorithm are securely returned to the user.
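The six steps above can be sketched as a simple ordered walk through job phases. This is a minimal illustration only; the type and function names below are assumptions, not the actual Ocean stack API:

```typescript
// Minimal sketch of the C2D workflow steps above.
// All names here are illustrative, not the real Ocean API.
type JobPhase =
  | 'ordersValidated'   // 1. consumer selects assets, dApp validates orders
  | 'podCreated'        // 2. dedicated, isolated execution pod (container) created
  | 'algorithmLoaded'   // 3. pod loads the specified algorithm
  | 'datasetLoaded'     // 4. pod securely loads the selected dataset
  | 'executed'          // 5. algorithm runs on the dataset inside the pod
  | 'resultsReturned';  // 6. results and logs returned to the user

function runC2dJob(datasetId: string, algorithmId: string): JobPhase[] {
  // Each phase must complete before the next begins.
  return [
    'ordersValidated',
    'podCreated',
    'algorithmLoaded',
    'datasetLoaded',
    'executed',
    'resultsReturned',
  ];
}
```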


We can mention that we use a web3 auto-generated/custom PK and this is how we ensure private access to results


Now, let's delve into the inner workings of the Provider. Initially, it verifies whether the Consumer has sent the appropriate datatokens to gain access to the desired data. Once validated, the Provider interacts with the Operator-Service, a microservice responsible for coordinating the job execution. The Provider submits a request to the Operator-Service, which subsequently forwards the request to the Operator-Engine, the actual compute system in operation.

The Operator-Engine, equipped with functionalities like running Kubernetes compute jobs, carries out the necessary computations as per the requirements. Throughout the computation process, the Operator-Engine informs the Operator-Service of the job's progress. Finally, when the job reaches completion, the Operator-Engine signals the Operator-Service, ensuring that the Provider receives notification of the job's successful conclusion.
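The hand-off between these components can be sketched as below. The class and method names are illustrative assumptions for this walkthrough, not the real microservice interfaces:

```typescript
// Hedged sketch of the Provider -> Operator-Service -> Operator-Engine
// coordination described above. All names are illustrative assumptions.
interface JobUpdate { jobId: string; status: 'running' | 'finished' }

class OperatorEngine {
  // Runs the compute job and reports progress back through a callback.
  run(jobId: string, report: (u: JobUpdate) => void): void {
    report({ jobId, status: 'running' });
    // ... computation happens here ...
    report({ jobId, status: 'finished' });
  }
}

class OperatorService {
  updates: JobUpdate[] = [];
  constructor(private engine: OperatorEngine) {}
  // Forwards the Provider's request to the Operator-Engine and
  // collects the progress updates the engine reports back.
  submit(jobId: string): JobUpdate[] {
    this.engine.run(jobId, (u) => this.updates.push(u));
    return this.updates;
  }
}

class Provider {
  constructor(private service: OperatorService) {}
  // Verifies the Consumer's datatoken order first, then delegates execution.
  startCompute(jobId: string, orderValid: boolean): JobUpdate[] {
    if (!orderValid) throw new Error('datatoken order not valid');
    return this.service.submit(jobId);
  }
}
```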


No kubernetes for now

- `GetComputeEnvironments` - returns the list of environments the algorithm can be run on
- `InitializeCompute` - generates the provider fees necessary for ordering the assets
- `FreeStartCompute` - runs algorithms without necessarily publishing the assets on-chain (dataset and algorithm), using free resources from the selected environment
- `PaidStartCompute` - runs algorithms with on-chain assets (dataset and algorithm), using paid resources from the selected environment. Payment is requested on every start-compute call and is handled by the `Escrow` contract.
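As a hedged sketch, a client might build a start-compute request like this. The command string `'startCompute'` is quoted from the node source elsewhere in this review; the payload field names are assumptions for illustration, not the documented node API:

```typescript
// 'startCompute' is quoted from the node source in this review;
// the payload shape below is an assumption for illustration.
const COMPUTE_START = 'startCompute';

function buildFreeStartCompute(datasetUrl: string, algorithmUrl: string) {
  return {
    command: COMPUTE_START,
    // Free flow: assets need not be published on-chain, so plain file
    // references are passed directly (field names assumed).
    datasets: [{ fileObject: { type: 'url', url: datasetUrl } }],
    algorithm: { fileObject: { type: 'url', url: algorithmUrl } },
  };
}
```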


startCompute

COMPUTE_START: 'startCompute',

- `FreeStartCompute` - runs algorithms without necessarily publishing the assets on-chain (dataset and algorithm), using free resources from the selected environment
- `PaidStartCompute` - runs algorithms with on-chain assets (dataset and algorithm), using paid resources from the selected environment. Payment is requested on every start-compute call and is handled by the `Escrow` contract.
- `ComputeGetStatus` - retrieves the compute job status.
- `ComputeStop` - stops the compute job while it is `Running`.


  COMPUTE_STOP: 'stopCompute',

- `PaidStartCompute` - runs algorithms with on-chain assets (dataset and algorithm), using paid resources from the selected environment. Payment is requested on every start-compute call and is handled by the `Escrow` contract.
- `ComputeGetStatus` - retrieves compute job status.
- `ComputeStop` - stops compute job execution when the job is `Running`.
- `ComputeGetResult` - returns compute job results when job is `Finished`.
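The status/result lifecycle of the last three operations can be sketched as follows; the client class, statuses, and return values are illustrative assumptions:

```typescript
// Sketch of the job lifecycle: results are only available once the
// job reports Finished. Names and values here are assumptions.
type JobStatus = 'Running' | 'Finished';

class ComputeClient {
  private status: JobStatus = 'Running';
  private polls = 0;
  // ComputeGetStatus: returns the current job status
  // (this stub finishes after three polls).
  getStatus(): JobStatus {
    if (++this.polls >= 3) this.status = 'Finished';
    return this.status;
  }
  // ComputeGetResult: only valid once the job is Finished.
  getResult(): string {
    if (this.status !== 'Finished') throw new Error('job still running');
    return 'results.tar';
  }
}
```

A caller would poll `getStatus()` until it returns `Finished`, then fetch the result.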


COMPUTE_GET_RESULT: 'getComputeResult',


One of its responsibilities is fetching and preparing the required assets and files, ensuring a smooth and seamless execution of the job. By meticulously handling the environment configuration, the **C2D Engine** guarantees that all necessary components are in place, setting the stage for a successful job execution.

1. **Fetching Dataset Assets**: It downloads the files corresponding to datasets and saves them under `/data/inputs/DID/`. The files are named by their array index, from 0 to X, where X depends on the total number of files associated with the dataset.
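The naming rule above can be sketched as a small path helper. The helper name is hypothetical, and the identifier segment is shown as a DID here, though it may equally be a URL, Arweave, or IPFS reference:

```typescript
// Sketch of the input-file layout described above: files for a dataset are
// saved under /data/inputs/<id>/ and named by array index 0..X.
// The helper name is hypothetical, not part of the C2D Engine.
function inputFilePaths(datasetId: string, fileCount: number): string[] {
  return Array.from(
    { length: fileCount },
    (_, index) => `/data/inputs/${datasetId}/${index}`
  );
}
```

For example, a dataset with two files would yield paths ending in `/0` and `/1`.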


Dataset can be did/url/arweave/ipfs


## Prerequisites

The prerequisite for this flow is the algorithm code, which can be provided as input through consumer components such as the Ocean CLI; the flow is also open for integration with other systems (e.g. the Ocean Enterprise Marketplace).


CLI and vscode extension, might be useful to add links


New dataset here will be url/did/ipfs/arweave.

We will now use the c2d_examples repo, not algo_dockers

Successfully merging this pull request may close these issues.

Create docs for new C2D free and paid flows