Guidance for users on what is and is not appropriate for the KVS #5784

jameshcorbett · 2024-03-11T17:28:47Z

jameshcorbett
Mar 11, 2024
Maintainer

In the past couple of months I've had two users come to me for help with using the KVS to store their job data. One of them mentioned their data would be around 1 GB per job.

Many years ago when I was working on a workflow manager, when Dong told me that Flux had a KVS, I wanted to use it to store my data. It's probably a common reaction--hear there's a KVS and want to use it to solve all your data problems.

It doesn't seem like we have documentation describing what sorts of use-cases are and are not appropriate for the KVS. I was hoping we could figure that out in this discussion and then I could maybe put some documentation up and link to it from the man pages on the CLI and Python bindings.

garlick · 2024-03-11T17:50:46Z

garlick
Mar 11, 2024
Maintainer

That is an excellent suggestion. I had planned to add some KVS docs to the flux-core readthedocs pages but hadn't settled on what that should look like. There is an old wiki document that I'd like to upcycle over here, but it's more for KVS internals.

Some random considerations:

The default location for a batch job's KVS is /tmp, which may be a ramdisk with limited space, but see the statedir broker attribute.
By default, a batch job's KVS content is not preserved, but see flux batch --dump.
We have a place for user-defined job data, defined in RFC 16.
The KVS back-end storage requirements go way up if there is significant "churn" in content, so rewriting a key many times during execution might be considered an antipattern.
Another antipattern is creating keys with huge values, since fetching such a key can cause head-of-line blocking in the broker.
flux-archive may be helpful in some use cases, especially where the data becomes input to subsequent jobs, because then the stage-in job shell plugin can be used.
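
To make the "churn" point concrete, here is a minimal sketch. It does not use the Flux API: `ToyStore`, `commit`, and `stored_bytes` are hypothetical names, and the toy assumes, as a simplification, that every committed rewrite leaves the previous version behind in the backing store. Under that assumption, rewriting one key on every step costs far more storage than accumulating in memory and committing once.

```python
import json


class ToyStore:
    """Toy append-only store standing in for the KVS back end (NOT the Flux API)."""

    def __init__(self):
        self.blobs = []    # assumption: every committed value is retained
        self.current = {}  # key -> index of the latest blob for that key

    def commit(self, key, value):
        # Serialize a snapshot of the value and append it; old blobs stay.
        self.blobs.append(json.dumps(value))
        self.current[key] = len(self.blobs) - 1

    def get(self, key):
        return json.loads(self.blobs[self.current[key]])

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs)


def churny(store, steps):
    """Antipattern: rewrite the same key on every step."""
    progress = []
    for step in range(steps):
        progress.append(step)
        store.commit("progress", progress)


def batched(store, steps):
    """Alternative: accumulate in memory and commit once at the end."""
    progress = []
    for step in range(steps):
        progress.append(step)
    store.commit("progress", progress)


churn_store, batch_store = ToyStore(), ToyStore()
churny(churn_store, 100)
batched(batch_store, 100)

# Both stores end with the same final value, but very different footprints.
print(len(churn_store.blobs), churn_store.stored_bytes())
print(len(batch_store.blobs), batch_store.stored_bytes())
```

How much a real instance grows depends on the content store and its configuration, so treat the numbers from this toy as illustrative only.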

Anyway just a few quick thoughts to get things started.

wihobbs · 2025-05-07T15:00:44Z

wihobbs
May 7, 2025
Maintainer

Hey @jameshcorbett, @garlick,

I'm working through our tutorial content now, and I think some docs as suggested above would be excellent to link to. (We might have a subsection of the tutorial devoted to the KVS Python API.) Have we figured out the patterns and anti-patterns enough to start on a docs page? Happy to help out here.

garlick · 2025-05-07T15:56:59Z

garlick
May 7, 2025
Maintainer

Maybe we could put up a page in the flux-core docs that just has the above suggestions and links for now. You could reference that from the tutorial and we could improve it as time permits?

Edit: It seems like we need a top level section in the core docs to contain this but I'm not sure what that should be. Maybe something to do with developing workflows? (where workflow could be loosely defined as the logic that drives the work that runs within a Flux sub-instance/batch job? Heh maybe that needs a glossary entry.)

chu11 · 2025-05-07T17:29:08Z

chu11
May 7, 2025
Maintainer

As an aside, see issue #6266 (being worked on right now). Generally speaking, there is some cap on KVS sizes for performance reasons, and users should store to Lustre, etc. instead. I don't know if we want some type of "YMMV" note on maximum data sizes, but if you're getting into, say, the 10s of megs, a parallel file system is probably the way to go.
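
A rough sketch of that rule of thumb, with everything in it assumed for illustration: `store_result` is a hypothetical helper, a plain dict stands in for the KVS, and the 10 MiB cutoff is just one reading of "10s of megs", not a documented limit. Small payloads are stored inline; large ones are written to a file (on a parallel file system in practice) and only the path and size are recorded under the key.

```python
import os
import tempfile

# Assumed cutoff; the thread only says "10s of megs", so tune for your site.
MAX_INLINE_BYTES = 10 * 1024 * 1024


def store_result(kvs, key, payload, spill_dir, max_inline=MAX_INLINE_BYTES):
    """Store small payloads inline; spill large ones to a file and record the path.

    `kvs` is a dict standing in for the KVS (NOT the Flux API).
    """
    if len(payload) <= max_inline:
        kvs[key] = {"inline": payload}
        return "inline"
    path = os.path.join(spill_dir, key.replace(".", "_") + ".bin")
    with open(path, "wb") as f:
        f.write(payload)
    # Only a small pointer record goes into the KVS stand-in.
    kvs[key] = {"path": path, "size": len(payload)}
    return "file"


kvs = {}
with tempfile.TemporaryDirectory() as spill_dir:
    # A tiny cutoff is used here so the demo stays fast.
    small = store_result(kvs, "results.summary", b"ok", spill_dir, max_inline=16)
    large = store_result(kvs, "results.field", b"x" * 1024, spill_dir, max_inline=16)
    spilled_size = os.path.getsize(kvs["results.field"]["path"])

print(small, large, spilled_size)
```

In a real workflow `spill_dir` would point at Lustre or another shared file system, so that other jobs can follow the stored path.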

garlick · 2025-05-07T17:46:29Z

garlick
May 7, 2025
Maintainer

I'll put up a PR real quick and we can discuss the details there.
