Get columnar example to non-realloc'ing state #646

Open
wants to merge 1 commit into
base: master

frankmcsherry
Member

Changes made to timely and examples/columnar.rs so that the example can run indefinitely without calling realloc in steady state. Concretely,

sudo dtrace -n 'pid$target::realloc:entry { @ = quantize(arg1); }' -c 'target/debug/examples/columnar'

results in (after however long you like)

...
seen: WordCountReference { text: "flat", diff: 24367 }
seen: WordCountReference { text: "flat", diff: 24368 }
seen: WordCountReference { text: "flat", diff: 24369 }
seen: WordCountReference { text: "container", diff: 24367 }
seen: WordCountReference { text: "container", diff: 24368 }
seen: WordCountReference { text: "container", diff: 24369 }
^C


           value  ------------- Distribution ------------- count    
               4 |                                         0        
               8 |@                                        3        
              16 |@@@@@                                    14       
              32 |@@                                       4        
              64 |@@@@@@@                                  18       
             128 |@@@@@@@@@@                               25       
             256 |@@@@@@@                                  19       
             512 |@@@@@                                    12       
            1024 |@@@                                      7        
            2048 |                                         0        

mcsherry@gallustrate timely-dataflow %

The changes are not all great. In particular, the change to thread::Puller feeds allocations back to the pusher from which it received them, which has the potential to leak some amount of memory if the pusher is no longer active. But without this, the pipeline channel is a point where allocations leave the system (the recipient has nothing it can do but drop them).
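To make the trade-off concrete, here is a minimal sketch of a single-threaded channel whose puller hands emptied containers back to the pusher. The names (`Shared`, `Pusher`, `Puller`, `recycle`, `allocations_after`) are hypothetical and chosen for illustration; this is not timely's thread::Puller, only a model of the idea under the assumption that containers are plain `Vec<T>`.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::Rc;

/// Hypothetical state shared by one pusher and one puller on the same thread.
struct Shared<T> {
    /// Full containers travelling from pusher to puller.
    full: VecDeque<Vec<T>>,
    /// Emptied containers travelling back from puller to pusher.
    recycled: VecDeque<Vec<T>>,
}

struct Pusher<T> {
    shared: Rc<RefCell<Shared<T>>>,
    /// Counts fresh allocations, to observe steady-state behavior.
    allocations: usize,
}

struct Puller<T> {
    shared: Rc<RefCell<Shared<T>>>,
}

fn channel<T>() -> (Pusher<T>, Puller<T>) {
    let shared = Rc::new(RefCell::new(Shared {
        full: VecDeque::new(),
        recycled: VecDeque::new(),
    }));
    (Pusher { shared: Rc::clone(&shared), allocations: 0 }, Puller { shared })
}

impl<T> Pusher<T> {
    /// Returns a recycled container if one is available, otherwise allocates.
    fn buffer(&mut self, capacity: usize) -> Vec<T> {
        let recycled = self.shared.borrow_mut().recycled.pop_front();
        match recycled {
            Some(buffer) => buffer,
            None => {
                self.allocations += 1;
                Vec::with_capacity(capacity)
            }
        }
    }

    fn push(&mut self, container: Vec<T>) {
        self.shared.borrow_mut().full.push_back(container);
    }
}

impl<T> Puller<T> {
    fn pull(&mut self) -> Option<Vec<T>> {
        self.shared.borrow_mut().full.pop_front()
    }

    /// Hands a container back to the pusher, cleared but with its capacity kept.
    fn recycle(&mut self, mut container: Vec<T>) {
        container.clear();
        self.shared.borrow_mut().recycled.push_back(container);
    }
}

/// Runs `rounds` push/pull cycles and reports how often the pusher allocated.
fn allocations_after(rounds: usize) -> usize {
    let (mut pusher, mut puller) = channel::<u64>();
    for round in 0..rounds {
        let mut buffer = pusher.buffer(1024);
        // Stay within the buffer's capacity so the Vec itself never regrows.
        buffer.extend(0..round as u64 % 1024);
        pusher.push(buffer);
        while let Some(container) = puller.pull() {
            puller.recycle(container);
        }
    }
    pusher.allocations
}
```

In this sketch the pusher allocates once and then reuses the same buffer. The leak risk mentioned above shows up here too: if the `Pusher` is dropped, anything the puller pushes into `recycled` stays alive for as long as the puller holds the shared state.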

The other changes seem pretty good, but are worth discussing. They mostly amount to a bit more care around exactly when we overwrite containers, restricting that to moments where we are more confident that we aren't electing to lose allocated containers. This is worth understanding, as "fixing these bugs" has the potential to look like having thread::Puller return containers into a length-one buffer: we sit on a bit more memory than before, although we probably intended to do so.

One thought was that containers have extract and finish, but they don't have an activate, or some other signal to indicate that now is a good time to prep a buffer from storage. The CapacityContainerBuilder, for example, has a current and an empty, both of which sit on resources and can't perfectly navigate the moments at which they should swap. E.g. once full, current goes into pending and is refilled from empty, and eventually pending drains into empty and is revealed, but just after that moment you'd like to move empty to current, which we cannot easily do at the moment.
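The swap problem can be modelled with a small stand-in builder. This is a sketch under assumptions, not timely's CapacityContainerBuilder: `SketchBuilder`, its fields, the allocation counter, and especially `activate` are hypothetical, and containers are simplified to `Vec<T>`. It follows the lifecycle described above (current to pending when full, pending to empty on extract) and adds one possible shape for the missing signal.

```rust
use std::collections::VecDeque;

/// Simplified stand-in for a capacity-based builder; not timely's actual type.
struct SketchBuilder<T> {
    capacity: usize,
    /// Container currently being filled.
    current: Vec<T>,
    /// Most recently revealed container, kept as a spare allocation.
    empty: Option<Vec<T>>,
    /// Full containers awaiting extraction.
    pending: VecDeque<Vec<T>>,
    /// Counts fresh allocations, to observe steady-state behavior.
    allocations: usize,
}

impl<T> SketchBuilder<T> {
    fn new(capacity: usize) -> Self {
        SketchBuilder {
            capacity,
            current: Vec::new(),
            empty: None,
            pending: VecDeque::new(),
            allocations: 0,
        }
    }

    fn push(&mut self, item: T) {
        if self.current.capacity() == 0 {
            // `current` has no backing memory, so a fresh allocation is needed.
            self.allocations += 1;
            self.current = Vec::with_capacity(self.capacity);
        }
        self.current.push(item);
        if self.current.len() == self.capacity {
            // Once full, `current` goes into `pending` and is refilled from `empty`.
            let next = self
                .empty
                .take()
                .map(|mut spare| {
                    spare.clear();
                    spare
                })
                .unwrap_or_default();
            let full = std::mem::replace(&mut self.current, next);
            self.pending.push_back(full);
        }
    }

    /// Drains one container from `pending` into `empty` and reveals it.
    /// Overwriting `empty` drops any spare that was already there.
    fn extract(&mut self) -> Option<&mut Vec<T>> {
        let full = self.pending.pop_front()?;
        self.empty = Some(full);
        self.empty.as_mut()
    }

    /// Hypothetical signal: before new input, move the spare into `current`.
    fn activate(&mut self) {
        if self.current.capacity() == 0 {
            if let Some(mut spare) = self.empty.take() {
                spare.clear();
                self.current = spare;
            }
        }
    }
}

/// Fills and drains the builder `rounds` times; returns the allocation count.
fn allocations_after(rounds: usize, use_activate: bool) -> usize {
    let mut builder = SketchBuilder::<u64>::new(4);
    for round in 0..rounds {
        if use_activate {
            builder.activate();
        }
        for item in 0..4 {
            builder.push(round as u64 + item);
        }
        while let Some(container) = builder.extract() {
            assert_eq!(container.len(), 4);
        }
    }
    builder.allocations
}
```

In this model, running without `activate` settles on two live allocations (one in current, one in empty), while calling `activate` before each round gets by with one. Whether a signal like this fits the real container builder traits is an open question; the sketch only shows why the moment just after extraction is the interesting one.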

@antiguru antiguru self-requested a review February 16, 2025 18:15