-
-
Notifications
You must be signed in to change notification settings - Fork 272
StateSpace Module Hurricane Case Study ~~WIP~~ Ready for Review #763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
View / edit / reply to this conversation on ReviewNB fonnesbeck commented on 2025-01-05T01:33:47Z Not sure if its my setup, but there are some stray $ characters in the intro. Dekermanjian commented on 2025-01-05T11:37:14Z Hi @fonnesneck! Thank you for taking the time to review this notebook. On ReviewNB it seems that the double $ that should generate an equation on its own line doesn't render, but it does on the PyMC examples page. That is why there are stray $ characters. |
View / edit / reply to this conversation on ReviewNB fonnesbeck commented on 2025-01-05T01:33:48Z We don't currently use plotly in any of the other examples, so need to update the requirements. Dekermanjian commented on 2025-01-05T11:41:24Z Yes, you are correct. I will move the comment on line 6 to the very top of this cell. I have Plotly included in the extra_installs in the myst.md file. Again this does not render in ReviewNB but it does on the PyMC examples website. You can see it if you visit the documentation preview https://pymcio--763.org.readthedocs.build/projects/examples/en/763/case_studies/ssm_hurricane_tracking.html |
View / edit / reply to this conversation on ReviewNB fonnesbeck commented on 2025-01-05T01:33:49Z spelling: "originate"
Also, maybe worth adding a sentence describing what defines an origination point. Dekermanjian commented on 2025-01-05T11:43:24Z Thank you for catching that! Yes, I agree with you. I will add some more context on what constitutes an origination point. |
Hi @fonnesneck! Thank you for taking the time to review this notebook. On ReviewNB it seems that the double $ that should generate an equation on its own line doesn't render, but it does on the PyMC examples page. That is why there are stray $ characters. View entire conversation on ReviewNB |
Yes, you are correct. I will move the comment on line 6 to the very top of this cell. I have Plotly included in the extra_installs in the myst.md file. Again this does not render in ReviewNB but it does on the PyMC examples website. You can see it if you visit the documentation preview https://pymcio--763.org.readthedocs.build/projects/examples/en/763/case_studies/ssm_hurricane_tracking.html View entire conversation on ReviewNB |
Thank you for catching that! Yes, I agree with you. I will add some more context on what constitutes an origination point. View entire conversation on ReviewNB |
2. Improved 24-hour ahead forecasts of simplest model 3. Added a section and model with how to add exogenous variables to the model
2. cleaned up typos and figure sizes/zoom
2. Added a non-linear B-Splines section
2. Added closing remarks section 3. Updated references bib file
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:25Z Some nitpicks on the latex:
Dekermanjian commented on 2025-04-03T23:34:44Z Totally agree with all of the above. I will clean up the equations with latex and make a note on the Joseph form of the filtered covariance equation. I will also look up what the convention is for the t subscript and replace it everywhere in the notebook that occurs. Dekermanjian commented on 2025-04-06T21:46:55Z Hey Jesse, I think maybe I misunderstood one of your comments. The letter "t" itself isn't the issue in your comment but the fact that I have it subscript t within the text under the definition column of the above tables. Is that correct? jessegrabowski commented on 2025-06-14T12:26:01Z Yes that's correct
|
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:26Z Line #3. Generates a 95% CI ellipse via a chi-square multivariate normal approximation Nit: we use numpy docstring formatting in the pymc codebase, might as well use it here too Dekermanjian commented on 2025-04-03T23:35:39Z Thank you for making me aware of the numpy formatting. I will make the change everywhere this applies. |
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:27Z Line #20. ): # First handful of ellipses are huge because of little data in the iterative nature of the Kalman filter
I'm reviewing this sequentially so haven't gotten to the model yet, but this might also be because the initial P0 matrix is too diffuse. Dekermanjian commented on 2025-04-03T23:37:51Z I didn't think of that. I will make a note of it in the text and I will play around with different P0 initializations. |
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:27Z Line #21. # Hack that should be fixed in StateSpace soon Was this ever fixed? I should do it immediately if not. Dekermanjian commented on 2025-04-03T23:39:29Z I will test it out with the latest versions of pymc + pymc-extras and report back if it isn't. I recall this being fixed, though. Dekermanjian commented on 2025-04-06T14:14:22Z Hey Jesse, the GitHub issue for this issue is still open pymc-devs/pymc-extras#424 and I can confirm that the hack is still required on the latest version of pymc-extras. |
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:28Z I guess the points are colored by category? Is it possible to add a legend for that? (This plot is super slick btw) Dekermanjian commented on 2025-04-03T23:41:24Z Yes, that is correct. I agree that a legend should be included. I will add one! |
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:29Z In this example we assume constant acceleration (In order to keep our system of equations linear) I don't think constant acceleration is required for linearity in this setup -- you could have a stochastic innovation on the acceleration term that allows it to change over time, and indeed you do! I guess this is a technical point required to discretize the Newtonian ODEs, but it's also not technically true in your model. Make of that what you will.
Our states are the following components derived from the Newtonian equations of motion Not required, but you could show how these matrices arise from integrating the ODEs p_dot(t) = v(t), v_dot(t) = a(t), a_dot(t) = 0 over an interval delta t.
In this example we fix our observation/measurement error to a small value (0.1) reflecting our confidence in the measurements. The latex matrix H has 0.5 on the diagonal, not 0.1. Maybe also make a note about the units of the data, and what the number 0.1 actually represents in terms of measurement error.
variane (diagonals) covariance (off-diagonals) of the Newtonian equations Typo: variance
Dekermanjian commented on 2025-04-04T00:04:13Z Yes, you are correct. I think I need to clarify that with this model we aren't able to model any angular acceleration whenever the hurricane makes maneuvers so we aim to capture turns with the stochastic innovation on the acceleration term.
I will put in the ODE integrations. I think this will be good for completeness. Of course, I will fix the mismatch between the latex and the text and the typo. |
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:30Z Line #13. simple_idata = pm.sample(nuts_sampler="numpyro", target_accept=0.95) Try with nutpie, you might be able to get away with not increasing target accept. Dekermanjian commented on 2025-04-04T00:05:03Z Yes, I will give it a go. I believe at the time the Jax backends were not implemented yet. I will give that a try too. |
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:31Z It uses data from the entire time series in the sense that your estimated parameters already saw the entire future.
But all it does is re-write your model so that Dekermanjian commented on 2025-04-04T00:06:04Z Thank you so much for clarifying this for me! jessegrabowski commented on 2025-04-04T03:46:16Z It happens here if you're interested:
|
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:32Z The addition to the R matrix imply that the exogenous parameters do not exhibit any process noise. Nit: I wouldn't describe the stochastic innovations in the transition equation as "noise", since they're in some sense "real", as updates to the values of the latent states of the system. I think of "noise" as arising from some flaw in our ability to observe, e.g. from the measurement errors. Dekermanjian commented on 2025-04-04T00:10:53Z I will go back and reword how the stochastic innovations are described in the text. |
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:32Z Maybe a forest plot here? It's a bit more compact.
You might also have to play with the prior variances on these. N(0, 10) is pretty aggressive, and you can see that it's not like an OLS regression where the priors quickly wash out. All these posteriors have pretty wide uncertainties. I'm not familiar enough with the science to know if an effect size of -20 * wind pressure is realistic, but your priors but non-zero probability mass on that.
Dekermanjian commented on 2025-04-04T00:13:43Z I agree, that priors are poorly specifies. I will go back and pick more sensible priors and I will condense the output with a forest plot. |
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:33Z To keep things simple, we are going to define a constant number of knots that are equal in both variables (same #knots for both longitude and latitude) and we are going to place the knots using a quantile function over the variable space. You can see the knots plotted out below for each variable. Write "same number of knots" -- currently looks like a hashtag
Dekermanjian commented on 2025-04-04T00:14:56Z Ah yes, I should've been more careful. I will make the change. |
View / edit / reply to this conversation on ReviewNB jessegrabowski commented on 2025-04-03T14:05:34Z Line #201. "exog_dims": [ Could use an iterator + f-string for these names; they're a bit verbose at the moment Dekermanjian commented on 2025-04-04T00:16:14Z Yes, totally agree. I will make the change. |
TL;DR use a single Not completely sure what is going on, I think it is a bug in myst-nb, but I think we can circumvent it in a relatively straightforward way. myst-nb has a fancy html class structure to make sure mathjax processed only the relevant bits of the page instead of the whole page, which should help load times and responsiveness in general. In the other notebooks, these |
Thank you, @OriolAbril! I will try this now! |
I don't think this will work. I haven't gone over the notebook to see where this would be used, but if it is related to a question @jessegrabowski asked about covariance matrices in one of the discords you should be able to do something like: target_variable = idata.posterior["variable_name"] # has dims (chain, draw, time, state)
corr = xr.corr(target_variable, target_variable.rename(state="state_bis"), dim=("chain", "draw"))
# corr has dims (time, state, state_bis) All elements of corr where the coord for state and state_bis match are 1. Other slices (e.g. |
Yeah we realized this was dumb, but it's good to hear you confirm it. We're going to compute it with a pymc model |
Looks like the heading trick didn't work. My next guess is plotly also uses mathjax and the configs of plotly and myst are clashing but I have no idea how we'd fix that. Do you know if it is possible to load plotly without mathjax? |
@OriolAbril, yeah it might be a plotly issue. I remember in earlier versions when the LaTeX was rendering properly the plotly figures were not rendering. At the time I selected a specific plotly renderer (https://plotly.com/python/renderers/) that rendered properly on NBReview so that the reviews could be done. I will check if there are any renderers that don't use mathjax. |
Thanks @OriolAbril, I will give this a try. This would solve an issue that I had originally thought was going to be more complicated than this. |
…ans and covariances. Updated plotly setting to automatically select renderer
Okay, looks like Plotly is the culprit here. Now I am able to get the math to render properly but now the figures won't render. I'm going to try out a few different renderers to see if I can get both to work at the same time. |
There seems to be many issues related to plotly/mathjax issues, both on their own and within sphinx builds. Most of them seem to be closed without any clear fix. There are some suggestions regarding mathjax configuration in some but they might be outdated by now. Some examples: jupyter-book/jupyter-book#1528, quarto-dev/quarto-cli#12060, plotly/plotly.js#4563, plotly/plotly.py#3152 |
It looks like reviewnb managed to render both elements in this case, but in case it helps, if things render on the website but not reviewnb don't hesitate to consider that a fix. ReviewNB has been a constant issue of rendering mishaps (as some other comments in this same PR show), and its "rendered preview" is to be officially ignored by reviewers, only the content and actual preview on RTD matter. |
…state means and covariances. Updated plotly setting to automatically select renderer" This reverts commit 07355d2. Revert using conditional posterior to using save_kalman_filter_outputs_in_idata
Thanks @OriolAbril, I am going to try a handful of renderers and if nothing works I will just turn the figures to static SVG those seem to render just fine. |
@OriolAbril do you know how we choose which on the graphs will be used for the thumbnail? It would be great to have one of the maps, they're really attractive. |
… with A and sometimes with T to only being referred to using T
IIRC, it is the last image in the notebook. It is probablyrestricted to png only though, or maybe png/jpeg only. The code that takes care of that is in https://github.com/pymc-devs/pymc-examples/blob/main/sphinxext/thumbnail_extractor.py. I am quite sure there are other notebooks that add an extra last cell that duplicates the plot they want shown and then remove that cell from the rendered view. I think that is significantly easier than adding extra metadata to the notebooks to choose the image and updating the sphinx extension to take that into account. |
Thanks @OriolAbril, I will give this a shot later today! |
Case Study to showcase the use of the State Space Module in
pymc-extras
This relates to #715
In this case study I have done the following:
I still wish to add the following:
@jessegrabowski Once I have completed this, would you be willing to be a reviewer for this case study?
EDIT:
I am thinking about skipping the VARMAX model section as this is starting to get a bit long, in my opinion.
I am open to feedback from others on whether to skip or add this section.
EDIT 2:
This is now ready for review.
EDIT 3:
I need some help with referencing a notebook in a repo. Specifically, this
I've read through the Jupyter Style documentation but it is still not apparent to me how to add a reference to a notebook in a repo that doesn't have a published sphinx documentation page.
📚 Documentation preview 📚: https://pymc-examples--763.org.readthedocs.build/en/763/