GarNet / Redis / Other Storage Options #510
Replies: 5 comments 15 replies
-
Our prod workflows also move files (.zip) around (from a generator to consumers) but never carry any workload data directly. Instead, workflows command a 'transfer-service' to move a file from a source URL (i.e. an HTTP endpoint served by the generator, known to the workflow instance) to a MinIO/S3 object-storage bucket (also known to the workflow). The 'transfer-service' is quite simple (given a source URL, download the zip file locally, then push it to MinIO/S3) but may require "double" authentication (1. access to the transfer-service's endpoint and 2. access to the file's source endpoint), so the workflow definition uses this awesome nested authentication feature ...
```yaml
- transferContentToLdsS3:
    call: openapi
    with:
      document:
        endpoint:
          uri: ${ "\( $context.environmentVariables.S3_MANAGER_URL )/openapi.json" }
      operationId: s3-manager
      parameters:
        body:
          transfers:
            - bucketName: ${ if $language != "ENU" then (($context.form.qualifiedName | split(" ") | join("-") | ascii_downcase) + "-" + ($language | ascii_downcase)) else ($context.form.qualifiedName | split(" ") | join("-") | ascii_downcase) end}
              objectName: SVN.zip
              makeBucket: true
              unzip: false
              source:
                url: ${ "\( $context.form.generator.baseUrl )/api/file/download/package/\( $context.packages[] | select(.name? // "" | contains($package)) | .[$language]? // {} | .fileId // "" )" }
                headers:
                  - name: Authorization
                    value: ${ "Bearer \( $authorization.parameter )" } # <<<<< Authenticate on the Generator's API!
      authentication:
        use: generator-oauth2 # <<<< Authenticate on the s3-manager (aka transfer-service) API!
...
```
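Stripped down, the transfer-service itself does little more than the sketch below (Python here purely for illustration; the endpoint names, credential wiring and bucket handling are simplified placeholders, not our actual implementation):

```python
# Stripped-down sketch of what a transfer-service endpoint does: download the zip
# from the source URL (using the forwarded Authorization header), then push it to
# a MinIO/S3 bucket. All names and the client wiring are placeholders.
import io

import requests
from minio import Minio

s3 = Minio("minio.internal:9000", access_key="...", secret_key="...", secure=False)


def transfer(source_url: str, auth_header: str, bucket: str, object_name: str,
             make_bucket: bool = True) -> str:
    # 1. Download the file from the generator's endpoint.
    response = requests.get(source_url, headers={"Authorization": auth_header}, timeout=120)
    response.raise_for_status()

    # 2. Push it to object storage, creating the bucket if requested.
    if make_bucket and not s3.bucket_exists(bucket):
        s3.make_bucket(bucket)
    data = response.content
    s3.put_object(bucket, object_name, io.BytesIO(data), length=len(data))

    # 3. Hand back a reference the workflow can carry around.
    return f"s3://{bucket}/{object_name}"
```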
-
Thanks @bvandewe, so it seems you're using an API to store all data consistently. In my world, that would e.g. be a GCS bucket. Is there any way I could access the value of a secret? (e.g. tried …)
-
Thanks @bvandewe, let me clarify a bit: our first workflow should …
My initial idea was to do it this way:
I have a PoC Python script for accessing images on GCS (authenticating with a key file mounted via the secrets directory), but I think you are proposing a completely different approach: outsourcing the heavy lifting to a deployed service, which in turn is called from the serverless workflows.
Does this correctly summarize your idea?
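For reference, my PoC is roughly along these lines: a small Python script that authenticates with the key file mounted via the secrets directory and pulls an object from GCS (bucket name, object name and key-file path below are placeholders, not our actual values):

```python
# Rough sketch of the GCS PoC: download one image, authenticating with a
# service-account key file mounted into the container via the secrets directory.
# Bucket, object and key-file path are placeholders.
from google.cloud import storage

KEY_FILE = "/run/secrets/gcs-service-account.json"
BUCKET = "my-image-bucket"
OBJECT = "images/example.png"


def download_image(bucket_name: str, object_name: str) -> bytes:
    """Download a single object from GCS and return its raw bytes."""
    client = storage.Client.from_service_account_json(KEY_FILE)
    blob = client.bucket(bucket_name).blob(object_name)
    return blob.download_as_bytes()


if __name__ == "__main__":
    data = download_image(BUCKET, OBJECT)
    print(f"downloaded {len(data)} bytes from gs://{BUCKET}/{OBJECT}")
```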
-
On a side note: as mentioned, I have a PoC for my initial flow using the GCS client in a Python script. When I add the base64 representation of the image data (51 KB raw image size, potentially larger once base64-encoded) to the output (stdout), the workflow never finishes (the runner keeps running forever). When I remove that particular field from the response, the workflow finishes successfully. I understand this approach is not the way to go, but I'd like to better understand which limits apply.
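To make the distinction concrete: the problematic run simply embedded the base64 string in stdout, whereas an output that only references the object avoids carrying the payload at all (and sidesteps the size question entirely, since base64 alone inflates the 51 KB image to roughly 68 KB). A rough sketch with placeholder names:

```python
# Sketch of a "reference only" output: print a small JSON document pointing at
# the object in GCS instead of embedding the base64-encoded bytes.
import json


def make_output(bucket_name: str, object_name: str, size_bytes: int) -> str:
    """Build a compact workflow output that references the image instead of carrying it."""
    return json.dumps({
        "image": {
            "uri": f"gs://{bucket_name}/{object_name}",
            "sizeBytes": size_bytes,
        }
    })


if __name__ == "__main__":
    # A downstream step can resolve the URI itself instead of receiving ~68 KB of base64.
    print(make_output("my-image-bucket", "images/example.png", 51 * 1024))
```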
-
@cdavernas I actually tried to do this, and at least locally in Docker I am not able to specify a bind mount in the runner template.
I found this code (and the definition here?), which seems to attempt to extract volumes and add them to binds. However, somehow this does not work for me (the container won't start with …)
-
Hi Devs
We're currently evaluating Synapse for production use. I have replaced Garnet with a GCP Redis instance (Cloud Memorystore), as this integrates more seamlessly with our infrastructure stacks (e.g. no persistent disk needed for the Garnet container) and reduces maintenance effort.
Our workflows will (among other things) download images from GCS and forward them to other workflows/steps in the output.
I currently see two options for doing that: passing the image data along in the workflow output, or keeping the whole flow (and the files) inside a single runner container.
The issue I see with the first is the large amount of data that would be exposed in logs/events etc.; the issue with the second is the constraint of staying inside a single runner container for the whole flow.
I am wondering whether there are better options to avoid congested logs/events, and whether a different storage solution could be used (e.g. a NoSQL DB). I have seen some code available here, but I am not sure how it could be used without changing the codebase.
So, the question is: how would you approach the challenge of large amounts of data, and how could I replace Redis with another persistence option (or combine both)?