Speed up build_ellipse_model #2046
Draft
Like #1288 (which seems abandoned), this speeds up build_ellipse_model to address #1168. However, it is written entirely in Cython and does not require a separate shared library. In local testing on Python 3.12, it's about 130x faster than the current pure Python implementation. I also made the tolerance for one of the unit tests much stricter, because I found that it was passing even when the output was incorrect.
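To illustrate the tolerance point, here is a minimal sketch of how a loose relative tolerance can hide a wrong result. The values and the `all_close` helper are made up for illustration and are not taken from the actual test:

```python
import math

def all_close(actual, expected, rel_tol):
    """Return True if every pair agrees within the relative tolerance."""
    return all(math.isclose(a, e, rel_tol=rel_tol) for a, e in zip(actual, expected))

# Hypothetical reference pixel values and a deliberately wrong result (5% too high).
expected = [100.0, 200.0, 300.0]
wrong = [v * 1.05 for v in expected]

# With a loose tolerance the incorrect output still passes.
loose_passes = all_close(wrong, expected, rel_tol=0.1)

# With a strict tolerance the same output is rejected.
strict_passes = all_close(wrong, expected, rel_tol=1e-6)
```

The same reasoning applies to array comparisons in the test suite: the tolerance has to be smaller than the errors the test is meant to catch.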
There is a potentially premature optimization to specialize the function depending on whether harmonic arrays are passed or not. The C code generated on my system looked correct, but I didn't actually find any difference in performance between `high_harmonics=False` and `True`.

I also attempted to parallelize it with Cython's `prange`. However, in my testing on multiple Linux systems, I found it never actually used more than one thread. Additionally, I couldn't figure out how to allow for a `num_threads=None` value in the Cython `build_ellipse_model_c` function, since it's typed as an int. Thus, I'm making this a draft PR to start. If someone more versed in Cython/prange could have a look at that, it would be helpful. Otherwise, the parallel support could simply be dropped.
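One possible way around the `num_threads=None` problem, sketched here in plain Python, is to resolve `None` to an integer in a Python-level wrapper before calling the typed function. This is only an assumption about how it could be done: `resolve_num_threads`, `build_model`, and the stub standing in for `build_ellipse_model_c` are hypothetical names, and treating `None` as "use all cores" is a choice made for this sketch.

```python
import os

def resolve_num_threads(num_threads=None):
    """Map an optional thread count to a plain positive int.

    Hypothetical helper: None is treated as "use all available cores".
    """
    if num_threads is None:
        # os.cpu_count() can return None, so fall back to a single thread.
        return os.cpu_count() or 1
    num_threads = int(num_threads)
    if num_threads < 1:
        raise ValueError("num_threads must be a positive integer or None")
    return num_threads

def build_ellipse_model_c_stub(num_threads):
    """Stand-in for the int-typed Cython function; it just echoes its argument."""
    return num_threads

def build_model(num_threads=None):
    # Python-level wrapper: the typed function only ever receives an int.
    return build_ellipse_model_c_stub(resolve_num_threads(num_threads))
```

With this split, the public function keeps an optional argument while the Cython signature stays `int`.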