Initial numba module #3225
I've done quite a bit of |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

    @@            Coverage Diff             @@
    ##             main    #3225      +/-   ##
    ==========================================
    - Coverage   89.61%   89.24%   -0.37%
    ==========================================
      Files          28       29       +1
      Lines       31983    32111     +128
      Branches     5888     5898      +10
    ==========================================
    - Hits        28660    28656       -4
    - Misses       1888     2018     +130
    - Partials     1435     1437       +2
|
Having some CI weirdness that I'm not yet able to recreate. |
CI Fixed. Here's some benchmarking with the "coalescent_nodes" method from #2778 on a TS with 12M edges: Using |
Shall we move the first commit into its own PR? It's cluttering up this one and making it hard to see the real changes. |
I had imagined something lower level that was basically a copy of the TreePosition class from here: https://github.com/jeromekelleher/sc2ts/blob/7758245c3dc537aeec3b7cd6282241b65f8843dd/sc2ts/jit.py#L107 So, we don't try to provide Pythonic APIs, but just provide direct access to the edges out and edges in, which can be numba compiled like the example in the sc2ts code. |
That's how this code works,

    while tree_pos.next():
        for j in range(tree_pos.out_range[0], tree_pos.out_range[1]):
            e = tree_pos.edge_removal_order[j]
            c = edges_child[e]
            p = edges_parent[e]
            parent[c] = -1
            u = p
            while u != -1:
                num_samples[u] -= num_samples[c]
                u = parent[u]

becomes

    for tree_pos in numba_ts.edge_diffs():
        for j in range(*tree_pos.edges_out_index_range):
            e = numba_ts.indexes_edge_removal_order[j]
            c = edges_child[e]
            p = edges_parent[e]
            parent[c] = -1
            u = p
            while u != -1:
                num_samples[u] -= num_samples[c]
                u = parent[u]

It is still compiled, and 30% faster (for the coalescent nodes example)! |
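The inner loop shared by both versions can be tried on a toy tree. This is only a minimal sketch: `remove_edges` is a hypothetical helper name, plain Python lists stand in for the numpy arrays, and no numba decorator is applied so that it runs without extra dependencies.

```python
def remove_edges(edge_ids, edges_parent, edges_child, parent, num_samples):
    """Detach each listed edge and subtract the child's sample count
    from every node on the path from the old parent up to the root."""
    for e in edge_ids:
        c = edges_child[e]
        p = edges_parent[e]
        parent[c] = -1  # the child becomes a root
        u = p
        while u != -1:
            num_samples[u] -= num_samples[c]
            u = parent[u]


# Toy tree: samples 0 and 1 under node 2, which sits under root 3.
edges_parent = [2, 2, 3]
edges_child = [0, 1, 2]
parent = [2, 2, 3, -1]
num_samples = [1, 1, 2, 2]

# Remove edge 0 (node 2 -> node 0).
remove_edges([0], edges_parent, edges_child, parent, num_samples)
print(parent)       # [-1, 2, 3, -1]
print(num_samples)  # [1, 1, 1, 1]
```

Removing the edge above sample 0 leaves nodes 2 and 3 with one sample each, which is the bookkeeping both loop styles have to perform for every outgoing edge.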
Ahh, I didn't spot that sorry. How is it faster then? I do think we should just stick with the TreePosition interface though, because we want to support seeking backwards as well, and ultimately randomly. There's no point in adding a layer of indirection on top of that. |
Mutating numpy arrays to maintain the state involves the following:
Whereas yielding lightweight immutable objects is much more amenable to numba optimisation. We might be able to get the same gains by using native objects for the state rather than numpy arrays if you are set against iteration. |
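To make the "lightweight immutable objects" idea concrete, here is a minimal sketch of a generator that yields one small immutable record per tree. It is not the PR's implementation: `TreeStep`, `edge_diffs_sketch`, `interval` and `edges_in_index_range` are hypothetical names (only `edges_out_index_range` appears in the code above), plain lists stand in for numpy arrays, and nothing is numba compiled. The sweep is the usual left-to-right walk over the edge insertion and removal orders.

```python
from typing import Iterator, NamedTuple, Tuple


class TreeStep(NamedTuple):
    """Immutable description of one tree transition (hypothetical type)."""
    interval: Tuple[float, float]
    edges_out_index_range: Tuple[int, int]  # half-open range into removal order
    edges_in_index_range: Tuple[int, int]   # half-open range into insertion order


def edge_diffs_sketch(edges_left, edges_right, insertion_order,
                      removal_order, sequence_length) -> Iterator[TreeStep]:
    num_edges = len(edges_left)
    j = 0  # next position in insertion_order
    k = 0  # next position in removal_order
    left = 0.0
    while j < num_edges or left < sequence_length:
        out_start = k
        while k < num_edges and edges_right[removal_order[k]] == left:
            k += 1
        in_start = j
        while j < num_edges and edges_left[insertion_order[j]] == left:
            j += 1
        # The tree ends at the next coordinate where an edge starts or stops.
        right = sequence_length
        if j < num_edges:
            right = min(right, edges_left[insertion_order[j]])
        if k < num_edges:
            right = min(right, edges_right[removal_order[k]])
        yield TreeStep((left, right), (out_start, k), (in_start, j))
        left = right


# Three edges on a sequence of length 10: one spans everything,
# one covers [0, 5) and one covers [5, 10).
steps = list(edge_diffs_sketch(
    edges_left=[0.0, 0.0, 5.0],
    edges_right=[10.0, 5.0, 10.0],
    insertion_order=[0, 1, 2],  # edges sorted by left coordinate
    removal_order=[1, 0, 2],    # edges sorted by right coordinate
    sequence_length=10.0,
))
for step in steps:
    print(step)
```

Each yielded record only carries index ranges, so the consumer reads edge IDs from the shared order arrays rather than from state that is mutated between steps.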
Let's talk it through in person - I don't have time to form an educated opinion I'm afraid. |
7aa7151
to
5d22c6c
I've tried to closely match the existing tsutil implementation with |
New code looks just as fast, proceeding to add some more tests. Will then merge this before doing docs. |
Getting some weird failures on Windows here, and coverage not counting for the new module, will fix. I've added a stab at some docs. |
Re docs, eventually we probably want a "high performance" tutorial with some of this stuff, but I can have a stab at that after 1.0. There are some comments here: tskit-dev/tutorials#150 (comment) and some code examples at tskit-dev/tutorials#63 |
docs/numba.md (Outdated)

    print(type(numba_ts))
    ```

    ## Tree Traversal
I normally think of "tree traversal" as iterating through the tree structure itself. Do you mean "Iterating through trees" here? I can't see any pre/postorder traversal code here.
Good point, I was avoiding the word "iteration" so as not to confuse it with a Python iterator - but I'll change it back as this is more confusing!
Maybe just "moving between trees"?
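For contrast with moving between trees, this is roughly what "tree traversal" in the pre/postorder sense looks like on a single tree stored as a parent array. It is a sketch only: the helper names are hypothetical, and every node whose parent is -1 is treated as a root, which is a simplification.

```python
def children_from_parent(parent):
    """Build child lists and the list of roots from a parent array."""
    children = [[] for _ in parent]
    roots = []
    for node, p in enumerate(parent):
        if p == -1:
            roots.append(node)
        else:
            children[p].append(node)
    return children, roots


def preorder(parent):
    """Visit each node before its children, left to right."""
    children, roots = children_from_parent(parent)
    order = []
    stack = list(reversed(roots))
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children[u]))
    return order


def postorder(parent):
    """Visit each node after its children: a node-first, right-to-left
    walk reversed at the end."""
    children, roots = children_from_parent(parent)
    order = []
    stack = list(roots)
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(children[u])
    order.reverse()
    return order


parent = [2, 2, 3, -1]  # 0 and 1 under 2, and 2 under root 3
print(preorder(parent))   # [3, 2, 0, 1]
print(postorder(parent))  # [0, 1, 2, 3]
```

Both functions walk the nodes of one tree, whereas the code in this PR steps from one tree to the next along the sequence.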
Part of #3135