Adding a reference #3

Hello - thanks for this resource! I find it really helpful.

I've been meaning to ask if it would be acceptable to add a reference to our recent paper https://arxiv.org/abs/2005.08926. I'm happy to open a pull request adding this if you like.
Thank you for the kind words. Added! I was just about to add Neural CDEs and some other Neural ODE papers from the recent flood of interesting arXiv preprints. I would also like to extend an invite to you and your collaborators to try out the torchdyn library, and perhaps contribute if you find it useful. Neural CDEs are already in the planned feature roadmap, so it'd be helpful to receive your input on where you'd feel they best fit in the library.
Thank you! Thanks also for the invite. We've also been thinking about the best way to make NCDEs work as a library, as there's a bunch of edge cases (of course!) that need handling correctly. This is something we're still figuring out, but once we're there I'd already planned to get in touch re: torchdyn.
@Zymrael Coming back around to this now with the release of torchcde. You might find it interesting to have a look at what we're doing there with respect to torchdyn. The key components are: ...

The actual implementation is all very straightforward. In large part ...
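(For concreteness, a minimal sketch of the usage pattern - not lifted from the repo itself; the function names follow torchcde's early API, and the README is the authoritative reference.)

```python
import torch
import torchcde

class CDEFunc(torch.nn.Module):
    # The vector field f_theta of the CDE dz = f_theta(z) dX.
    def __init__(self, input_channels, hidden_channels):
        super().__init__()
        self.input_channels = input_channels
        self.hidden_channels = hidden_channels
        self.linear = torch.nn.Linear(hidden_channels,
                                      hidden_channels * input_channels)

    def forward(self, t, z):
        # The output is matrix-valued: it gets multiplied against dX/dt.
        return self.linear(z).tanh().view(*z.shape[:-1],
                                          self.hidden_channels,
                                          self.input_channels)

# Data of shape (batch, length, channels), with time included as a channel.
x = torch.randn(32, 100, 3)
# Interpolate the data, then solve the CDE driven by that interpolation.
coeffs = torchcde.natural_cubic_spline_coeffs(x)
X = torchcde.NaturalCubicSpline(coeffs)
func = CDEFunc(input_channels=3, hidden_channels=8)
z0 = torch.randn(32, 8)
z = torchcde.cdeint(X=X, func=func, z0=z0, t=X.interval)
```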
@patrick-kidger Thanks for sharing; the tutorials are well made (though I tend to prefer notebooks to scripts) and I enjoy your commenting style. On our end, we are planning an early October release with a lot of focus on Neural SDEs, sequence / latent variants and some utilities for control. I have a baby Neural CDE version somewhere in my WIP folders and was just recently thinking about whether I should push it to a usable state for torchdyn.

Any particular reason for an entirely separate package? E.g. are you planning to expand CDEs in future research with more interpolation styles (or perhaps using some other learning technique to obtain ...)?

It'd be nice to receive your feedback on the direction you think will be best for us as a community. I see us going for either (1) a "shattered" but highly interconnected collection of highly specialized libraries SciML-style, where we'd have a group of active maintainers working on different packages (such as yourself with your cool work on ...), or (2) ...

We'd enjoy chatting about ...
This post is a bit of a behemoth.

torchcde

Thanks, I'm glad you like the tutorials. In terms of having a separate package - for myself, this is mostly intended as a convenience. For example I'd like to have a ...

For others, I'm hoping it serves as an easy way to use NCDEs without having to worry too much about the details of what a Stieltjes integral actually is, and as a place to clearly document "please remember to add time as a channel" and "irregular data is easy".

Looking to the future, there's quite a lot of CDE theory and I wouldn't be surprised if some of it ends up making sense to implement in torchcde at some point. (You can already see this with the logsignature stuff - c.f. the bottom of the README - paper on that to appear shortly.) There's definitely a lot still to be done with NCDEs.

ecosystem

Haha, I wish there were 10 Patricks - my papers might already be written! I think my preference would be to put everything in one spot. The reason for that is that it would be nice to have diffeq-related things together, akin to ...

I've not spoken to Ricky about this (he may well feel differently!), but if I had a magic wand I think I'd put everything into, and hugely expand, ... Practically speaking I'm not expecting anything like that to happen quickly.

If nothing else, I think in this model I'd look to follow the torch/torchvision pattern. Have ...

On interpolation schemes in particular, I have pondered splitting that out into a ...

Side note: I don't think a comparison to the shattered approach of SciML is actually fair, because even there they have DifferentialEquations.jl, which puts basically all the differential equation stuff in one place. The things that get spun out into separate libraries seem to be non-diff-eq things, like quadrature or banded matrices or what have you.

implications

What all of that actually boils down to, at least in the short term: ...
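(Going back to the two torchcde points above - "add time as a channel" and "irregular data is easy" - a minimal sketch, on the assumption that missing values are represented as NaNs for the interpolation scheme to fill in:)

```python
import torch

batch, length, channels = 32, 100, 2
values = torch.randn(batch, length, channels)
# "Irregular data is easy": missing observations are just NaNs, which the
# interpolation scheme then fills in.
values[torch.rand(batch, length, channels) < 0.3] = float('nan')

# "Please remember to add time as a channel":
t = torch.linspace(0., 1., length)
time_channel = t.unsqueeze(0).unsqueeze(-1).expand(batch, length, 1)
x = torch.cat([time_channel, values], dim=-1)  # (batch, length, channels + 1)
# x can now be interpolated and solved exactly as in the earlier sketch.
```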
A behemoth of a post, but an interesting one :D. I'll provide my own beastly wall-of-text below:

ecosystem: As you can imagine, however, I do not agree regarding intermediate ...

Something interesting about Julia and SciML: DifferentialEquations.jl does indeed glue together a lot of methods, but they maintain a separate DiffEqFlux (torchdyn-like) library and more, such as DiffEqOperators.jl, DiffEqBase.jl, StochasticDiffEq.jl.
Three big components of their ecosystem are the specific solver packages, ...

implications: ...

torchcde: It turns out for our specific needs a ...

Good luck with your ICLR submissions, looking forward to seeing what you've been up to in the past few months. We'll get back in touch to organize a Zoom meeting after the ICLR storm passes!
Haha - alright, here's a much shorter response.

ecosystem: Yeah, I guessed you might not regarding the ...

Ah, you clearly know a bit more about SciML than I do. I've not used it, but everything about it looks pretty good. If it wasn't for the fact that the ML community has standardised on Python then I'd probably make the switch to that myself.

implications: I'm guessing you mean "non-PyTorch". ;) Yeah, JAX has much better autodiff than PyTorch. This is something we've been running into in ...

On the flip side - and I've not experimented with this at all - unlike JAX, I think PyTorch supports task-based parallelism via ...

torchcde: Interesting; do you use the convention that time is always the last channel of the state? That touches on a bit of a subtlety to this whole procedure (one of the things ...); I won't try and get into that now though.

What you're proposing makes sense to me. Anyway, let's pick all this up whenever we have that meeting.

(PS: since you mention it - if you don't already know, ...)
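(To make that task-parallelism point concrete - a hypothetical sketch, not taken from any of the libraries discussed here, assuming torch.jit.fork is the mechanism meant by the truncated reference above; note that the asynchronous execution only actually happens under TorchScript.)

```python
import torch
from typing import List

def solve_one(x: torch.Tensor) -> torch.Tensor:
    # Stand-in for an expensive per-sample computation, e.g. one ODE/CDE solve.
    for _ in range(10):
        x = torch.tanh(x @ x.t())
    return x

@torch.jit.script
def solve_all(xs: List[torch.Tensor]) -> List[torch.Tensor]:
    # Launch each solve as an asynchronous task, then gather the results.
    futures: List[torch.jit.Future[torch.Tensor]] = []
    for x in xs:
        futures.append(torch.jit.fork(solve_one, x))
    return [torch.jit.wait(f) for f in futures]

results = solve_all([torch.randn(64, 64) for _ in range(8)])
```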
@patrick-kidger @ChrisRackauckas
I think https://github.com/SciML/DiffEqGPU.jl has a dedicated solver for each thread, although I'm not sure. I believe that they are still working on getting the autodiff working for it though: SciML/DiffEqGPU.jl#72

Btw I have been trying to do an implementation of the torchcde method in DiffEqFlux: SciML/DiffEqFlux.jl#408

I was able to get a bit of a speedup (~5.5x) over the PyTorch version by using DiffEqFlux, but I think the main way to get any other major speedups would be having dedicated solvers for each thread.
https://diffeqflux.sciml.ai/dev/examples/optimization_sde/ is a tutorial that demonstrates task-based parallelism on SDEs. You can run that on a cluster too just by adding ...

Note that with the task-based parallelism you can assign a GPU per thread, and it should "just work". https://juliagpu.gitlab.io/CUDA.jl/usage/multigpu/ describes how to do that, so you could use EnsembleThreads + have each thread have a separate GPU to locally do GPU-per-ODE training (instead of GPU ensembling), which I think is more appropriate for the CDE. I haven't tried it, but it should work and would be a cool tutorial demo.

Also while I'm here, there's been a lot of movement in the interop areas. GPU support now works on R through ModelingToolkit being used as a JIT: https://www.stochasticlifestyle.com/gpu-accelerated-ode-solving-in-r-with-julia-the-language-of-libraries/, and that means adjoints should work. Someone should test adjoints through MTK-compiled R code. That almost means neural ODEs in R can be trained with Julia directly without writing any Julia code... but... MTK currently scalarizes operations (because of its link to Modelica symbolic compiler systems), but that's an issue we're going to overcome by November (linked to a closed project that will require it). So... neural ODEs in R are almost done. And the R version will automatically install Julia in the background too: JuliaInterop/JuliaCall#135. So R should have very good SciML support hopefully by November.

The reason why we did R first was that ModelingToolkit on Python hit a few issues, which the devs are going to help me solve. Then we should be able to do similar demos from Python as well, with similar limitations, but supporting something as straightforwardly defined as a neural ODE should be fine (once non-scalarizing support is completed).