computational performance #7
offline discussion, reply from Mikael:
Here is a concrete example from a single-layer shallow water solver. The RHS of the equations of motion is computed with this part of the code:
where:
Does this look like |
Hi
You can profile the program quite easily using line_profiler. Install it and add decorator |
Thanks Mikael, that is incredibly useful. Here is an excerpt from the line_profiler output for the original version:
and for the optimized version:
This is quite a difference indeed!
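For readers who have not used line_profiler, the workflow suggested above can be sketched roughly as follows. This is only an illustration: the decorator is assumed to be the `@profile` name that `kernprof` injects when a script is run under it, and `compute_rhs` is a hypothetical pure-Python stand-in, not the solver code discussed in this thread.

```python
# Minimal sketch of line-by-line profiling with line_profiler.
# Assumption: the decorator meant above is `@profile`, which kernprof
# injects into builtins when the script is run under it.
try:
    profile  # defined only when running under kernprof
except NameError:
    def profile(func):
        # No-op fallback so the script also runs without kernprof.
        return func


@profile
def compute_rhs(u, dx):
    """Hypothetical right-hand side: centred difference on a periodic grid."""
    n = len(u)
    rhs = [0.0] * n
    for i in range(n):
        # u[i - 1] wraps around for i == 0, giving periodic boundaries.
        rhs[i] = -(u[(i + 1) % n] - u[i - 1]) / (2.0 * dx)
    return rhs


if __name__ == "__main__":
    u = [float(i % 7) for i in range(1000)]
    for _ in range(100):
        compute_rhs(u, dx=0.1)
```

Saved as, say, `rhs_example.py` (a made-up file name), this would be run with `kernprof -l -v rhs_example.py`, which prints the time spent on each line of the decorated function.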
Hi @apatlpo, in case you're interested in squeezing out another few percent of performance: the next step is to use Cython, Numba, weave, f2py or something similar on the term
And for me this is about 50 % faster than |
Thanks again Mikael, this is again very helpful!
I don't think anything is messed up. It's just that the optimized term accounts for no more than 5-10 % of the total computation time, so the effect of the optimization is not so large. If you isolate |
Ok, thanks, I get it now. I've updated the links above with better isolation of the optimized sections.
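The point about the optimized term being only 5-10 % of the total time can be made concrete with Amdahl's law. The sketch below assumes that "about 50 % faster" means a local speedup factor of 1.5 for that term; that reading, and the function name, are assumptions made for illustration only.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: overall speedup when `fraction` of the runtime
    is accelerated by a factor `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Assumed values: the term is 5-10 % of the runtime and becomes 1.5x faster.
for fraction in (0.05, 0.10):
    s = overall_speedup(fraction, 1.5)
    print(f"fraction={fraction:.2f}: overall speedup {s:.3f}x")
```

Under these assumptions the whole time step gets only about 2-3 % faster, which is why the gain is hard to see unless the optimized section is timed in isolation.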
I have not found guidelines regarding large-scale applications, either in the shenfun paper or on spectralDNS/shenfun, e.g.:
What operations are costly at the end?
What variables hold data at the end?
Note that I realize your paper may have what is required to answer these questions.
Yet after reading the paper, the answers are not straightforward for me.
It may just be an issue of form; here are some suggestions about form:
In my case for example, I am not familiar with spectral methods and their bottlenecks.
I am also always nervous about hidden Python overheads or subtleties that may impact performance.
At the stage I'm at, i.e. planning the development of code based on shenfun that will ultimately run
at HPC scale, this seems important.