|
| 1 | +--- |
| 2 | +title: RSECon 2024 Highlights |
| 3 | +author: |
| 4 | + - name: Joe Marsh Rossney |
| 5 | + email: joemar@ceh.ac.uk |
| 6 | + - name: Jo Walsh |
| 7 | + email: jowals@ceh.ac.uk |
| 8 | + - name: Matt Brown |
| 9 | + email: matbro@ceh.ac.uk |
| 10 | + - name: Matt Coole |
| 11 | + email: matcoo@ceh.ac.uk |
| 12 | +date: today |
| 13 | +date-format: full |
| 14 | +format: |
| 15 | + revealjs: |
| 16 | + logo: ../../img/logo.png |
| 17 | + smaller: true |
| 18 | + scrollable: true |
| 19 | + progress: true |
| 20 | + embed-resources: true |
| 21 | +--- |
| 22 | + |
| 23 | +# Joe's highlights |
| 24 | + |
| 25 | +* * * |
| 26 | + |
| 27 | +### CodeRefinery workshops |
| 28 | + |
| 29 | +> _"We train you in research software development"_ |
| 30 | +
|
| 31 | +[coderefinery.org](https://coderefinery.org) | [github.com/coderefinery](https://github.com/coderefinery) | [nordic-rse.org](https://nordic-rse.org/) |
| 32 | + |
| 33 | +- Live-streamed workshops on Twitch, similar to [The Carpentries](https://carpentries.org/index.html) |
| 34 | +- Hybrid 'bring your own classroom' format - like a 'watch party' |
| 35 | +- Previous workshops uploaded to [youtube.com/\@coderefinery3414](https://www.youtube.com/\@coderefinery3414) |
| 36 | +- Publicly funded (by Nordic research council), run as a community project |
| 37 | +- They seem incredibly open and keen to collaborate, see e.g. [coderefinery.org/tasks](https://coderefinery.org/tasks/) |
| 38 | + |
| 39 | +#### For RSEs & other instructors... |
| 40 | + |
| 41 | +- They run instructor training workshops (help us become better teachers) |
| 42 | +- RSEs could provide workshops via their platform, so that our colleagues at other organisations can benefit |
| 43 | + |
| 44 | +* * * |
| 45 | + |
| 46 | +### Carbon cost of software |
| 47 | + |
| 48 | +#### Carbon cost calculator from [green-algorithms.org](https://www.green-algorithms.org/) |
| 49 | + |
| 50 | +- Input details of algorithm, runtime, hardware to get a CO~2~ estimate |
| 51 | +- Inevitably _very_ large errors on estimates, particularly for HPC |
| 52 | +- See [doi.org/10.1002/advs.202100707](https://onlinelibrary.wiley.com/doi/10.1002/advs.202100707) for methodology |
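The kind of calculation behind such an estimate can be sketched in a few lines. This is a minimal illustration in the spirit of the methodology paper, not the calculator's actual code: the function name, the example inputs (core power, PUE, carbon intensity) and the memory power constant are all assumptions that should be checked against the source before use.

```python
# Illustrative sketch of a Green Algorithms-style estimate.
# All names and numbers below are assumptions, not the calculator's code.

# Assumed memory power draw in watts per GB (check against the methodology paper).
MEMORY_POWER_W_PER_GB = 0.3725


def estimate_carbon_g(
    runtime_h: float,
    n_cores: int,
    core_power_w: float,
    core_usage: float,
    memory_gb: float,
    pue: float,
    carbon_intensity_g_per_kwh: float,
) -> float:
    """Return an estimated footprint in grams of CO2-equivalent."""
    # Power drawn by the cores (scaled by usage) plus the allocated memory.
    power_w = n_cores * core_power_w * core_usage + memory_gb * MEMORY_POWER_W_PER_GB
    # Scale by data-centre overhead (PUE) and convert watt-hours to kWh.
    energy_kwh = runtime_h * power_w * pue / 1000.0
    # Multiply by the carbon intensity of the local electricity supply.
    return energy_kwh * carbon_intensity_g_per_kwh


# Hypothetical job: 10 hours on 8 fully used cores with 32 GB of memory.
estimate = estimate_carbon_g(
    runtime_h=10.0,
    n_cores=8,
    core_power_w=12.0,
    core_usage=1.0,
    memory_gb=32.0,
    pue=1.5,
    carbon_intensity_g_per_kwh=200.0,
)
print(f"{estimate:.0f} gCO2e")
```

Every input is itself uncertain (especially per-core power and PUE on shared HPC systems), which is one reason the errors on the final figure are so large.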
| 53 | + |
| 54 | +#### A Python package [codecarbon.io](https://codecarbon.io) for _in situ_ estimates |
| 55 | + |
| 56 | +- Speaker (from STFC) used `codecarbon` to add CO~2~ estimate to existing tool for benchmarking optimisation algorithms: [github.com/fitbenchmarking](https://github.com/fitbenchmarking/fitbenchmarking) |
| 57 | + |
| 58 | +#### An HPC case study ([archer2.ac.uk](https://www.archer2.ac.uk/)) |
| 59 | + |
| 60 | +- Surprising claim: energy efficiency is the wrong metric as renewables increasingly dominate electricity generation. |
| 61 | +- Instead, aim to maximise life-span and system utilisation. |
| 62 | +- ARCHER2 aiming to provide emissions info to users (JASMIN could do the same!) |
| 63 | + |
| 64 | +* * * |
| 65 | + |
| 66 | +### Reproducible development environments |
| 67 | + |
| 68 | +- Speaker reported on their experience using [jetify.com/devbox](https://www.jetify.com/devbox), a command-line tool for generating isolated dev environments based on Nix (see [nixos.org](https://nixos.org/)) |
| 69 | +- Just modifies `$PATH` (no virtualisation) --- very similar to `cargo`, `poetry`, `pixi` in both principle and practice |
| 70 | +- `devbox.lock` contains everything needed to reproduce environment _exactly_ |
| 71 | + |
| 72 | +```sh |
| 73 | +devbox init # creates devbox.json |
| 74 | +devbox add git # installs git@latest & adds it to devbox.json |
| 75 | +devbox add python@3.12 # same for Python 3.12 |
| 76 | +devbox shell # activates the shell |
| 77 | +``` |
| 78 | + |
| 79 | +- Nix lets you build an entire OS deterministically from a configuration file --- consistent environment on local host, CI, VM |
| 80 | +- Unfortunately Nix does not build packages with non-free backends, so e.g. PyTorch with MKL & CUDA is difficult and tedious |
| 81 | +- Doesn't work on Windows |
| 82 | + |
| 83 | +* * * |
| 84 | + |
| 85 | +### Notable mentions |
| 86 | + |
| 87 | +#### Weather & climate RSEs - discussion |
| 88 | + |
| 89 | +- Participants made notes during the session: [hackmd.io/W9YAQowdSSqJ2RELJEd5tw](https://hackmd.io/W9YAQowdSSqJ2RELJEd5tw) |
| 90 | +- Join the new channel `#weather-climate` in [ukrse.slack.com](https://ukrse.slack.com/) |
| 91 | +- Subscribe to the mailing list using the Google Form here: [tinyurl.com/49x7c4fc](https://tinyurl.com/49x7c4fc) |
| 92 | +- Join the Special Interest Group using the same form |
| 93 | + |
| 94 | +#### Framework for scaling up reproducible practices in research organisations |
| 95 | + |
| 96 | +- Framework ([zenodo.org/records/10664660](https://zenodo.org/records/10664660)) based on a mixed-methods study [zenodo.org/records/10663903](https://zenodo.org/records/10663903) |
| 97 | +- Dimensions considered: tools, training, incentives, mentors, feedback, expert involvement, policies and procedures |
| 98 | +- In most cases the interesting and difficult part is the 'scaling up' part! |
| 99 | +- Also discussed 'good enough practices in scientific computing' --- [doi.org/10.1371/journal.pcbi.1005510](https://doi.org/10.1371/journal.pcbi.1005510) |
| 100 | + |
| 101 | +<!-- and suggested interventions from [bioRxiv 2022.12.08.519666](https://www.biorxiv.org/content/10.1101/2022.12.08.519666v1) --> |
| 102 | + |
| 103 | + |
| 104 | +# Jo's highlights |
| 105 | + |
| 106 | +* * * |
| 107 | + |
| 108 | +### "Scaling reproducibility" influencing change workshop |
| 109 | + |
| 110 | +- Discussion-focused look at the detail of [the framework](https://zenodo.org/records/10664660) |
| 111 | +- Assess your org's maturity levels against different criteria: |
| 112 | +- 'Locus of leadership', 'Communities of Practice', 'Tools', 'Education and training', 'Incentives', 'Modelling and mentoring', 'Review and feedback', 'Expert involvement', 'Policies and procedures' |
| 113 | +- Helps pick where to focus effort for most payoff of impact |
| 114 | +- Helps map a path towards where to improve, without despairing about where you are now |
| 115 | +- Leaves you reflecting that the benefits of RSE work are more culture than code |
| 116 | + |
| 117 | +* * * |
| 118 | + |
| 119 | +### Mutation testing - who tests the testers? |
| 120 | + |
| 121 | +- Met Office use of [mutmut](https://mutmut.readthedocs.io/en/latest/index.html) to stress their Python tests |
| 122 | +- In essence: change the meaning of one line of code; if the tests still pass, they're weak |
| 123 | +- Helps think twice about what your code is doing, not only how you're testing |
| 124 | +- Useful for big legacy codebases as well as new work |
| 125 | + |
| 126 | +* * * |
| 127 | + |
| 128 | +### "Reproducible distributed research in practice" |
| 129 | + |
| 130 | +- Next-level approach to distributed data-versioning and workflow tools |
| 131 | +- Active development by [RESIDE](https://reside-ic.github.io/) group at the MRC Centre at Imperial |
| 132 | +- "leans on metaphors from containerisation" - best of git, docker and `{targets}` |
| 133 | +- Language-agnostic, with a [`pyorderly`](https://github.com/mrc-ide/pyorderly) equivalent to R's `orderly` |
| 134 | + |
| 135 | +* * * |
| 136 | + |
| 137 | +### Special mentions! |
| 138 | + |
| 139 | +- The [YeSTEM Equity Compass](https://yestem.org/tools/the-equity-compass/) from the opening keynote |
| 140 | +- Early morning classes! Bhangra dancing, meditation |
| 141 | +- Quiet Room for neurodiverse people to decompress in |
| 142 | +- Lots of potential for more conversational, self-organising activity |
| 143 | + |
| 144 | + |
| 145 | +# Matt B's highlights |
| 146 | + |
| 147 | +* * * |
| 148 | + |
| 149 | +### Highlights |
| 150 | +- [Document checklist/assessment criteria](https://docs.google.com/document/d/1NuBSkRCY3wpLmuM0kGmZA8wPxTnqu4Ow6B-Ay1_vgoM/edit) and [worksheet](https://cehacuk.sharepoint.com/:x:/r/sites/ResearchSoftwareEngineeringCommunity/_layouts/15/Doc.aspx?sourcedoc=%7BCEF59DB7-CC17-4FE7-8B13-2104DBD33D3D%7D&file=Reproducibility_Assessment_Worksheet.xlsx&wdOrigin=TEAMS-MAGLEV.p2p_ns.rwc&action=default&mobileredirect=true) for assessing where organisations stand on software reproducibility (Jo already covered this nicely) |
| 151 | +- [RO-Crate](https://www.researchobject.org/ro-crate/). A way of packaging metadata and provenance with data. Could be useful for FDRI. Follow up with a [previous assessment](https://wiki.ceh.ac.uk/display/ad/ro-crate) of it conducted in the EDS team. [(Py)orderly](https://github.com/mrc-ide/pyorderly) is in a similar space too (already covered) |
| 152 | +- Great conversation with friendly folk at UKAEA, who have had an RSE group for \~7 years and are very willing to help us set up ours |
| 153 | + |
| 154 | +* * * |
| 155 | + |
| 156 | +### The importance of boilerplate code |
| 157 | + |
| 158 | +- A reminder that this code can make big impacts! |
| 159 | +- Tricky to install? Use a CI workflow on GitHub to automatically upload to PyPI |
| 160 | +- Hard to get (small) inputs into code? Collect in a data repository and auto-download from code |
| 161 | +- Doesn't work in X language? Write a wrapper around the code in X language |
| 162 | +- Some nice things to bear in mind in our work |
| 163 | + |
| 164 | +* * * |
| 165 | + |
| 166 | +### Training resources |
| 167 | + |
| 168 | +- Some [performance and profiling training resources](https://rse.shef.ac.uk/pando-python/index.html) which are nice-to-haves, something we can point out when reviewing code, use in our best practices when developing code and fold into any training where appropriate |
| 169 | +- Some [testing training resources](https://github.com/abhidg/advanced-python-testing) |
| 170 | +- Some [FAIR4RS (FAIR for Research Software) training resources](https://carpentries-incubator.github.io/fair-research-software/00-introduction.html) |
| 171 | +- [Coderefinery](https://www.coderefinery.org) "bring your own classroom" approach (mentioned by Joe earlier) |
| 172 | + |
| 173 | +* * * |
| 174 | + |
| 175 | +### Working with object storage and gridded data |
| 176 | +- Work at NOC that has close parallels with what I am trying to do in FDRI. [Presentation](https://docs.google.com/presentation/d/18RZSS0LEIYOHK0Qhh_UWaCh00lhfXlfO/edit#slide=id.p1) here, [streamlit-based visualisation app](https://github.com/NOC-OI/class_streamlit_zarr) here, cute little command line application aiming to simplify the netcdf -> rechunk -> zarr pipeline [here](https://github.com/NOC-OI/msm-os). |
| 177 | +- They also have a Data Science Platform that it'd be good to get to know better. Meeting being arranged! |
| 178 | + |
| 179 | +* * * |
| 180 | + |
| 181 | +### Honourable mentions |
| 182 | +- [A name change policy working group](https://ncpwg.org/resources/authors/) aiming to make it easier for researchers to change their names and not be harmed by it, given academia's focus on your publishing/contribution record |
| 183 | +- This was mentioned in an EEDI "Birds of a Feather" session, which had lots of other great inclusivity ideas! Notes from the session to be distributed once anonymised. |
| 184 | +- The friendly and accommodating vibe! |
| 185 | + |
| 186 | + |
| 187 | +# Matt C's highlights |
| 188 | + |
| 189 | +* * * |
| 190 | +### Technical Presentations |
| 191 | +- Performant Python Patterns ([Robert Chisholm](https://github.com/Robadob)) |
| 192 | + - [Python Profiling (Carpentries)](https://rse.shef.ac.uk/pando-python/) |
| 193 | + - Incredible performance of Python built-ins. |
| 194 | +- Advanced Python Testing ([Abhishek Dasgupta](https://github.com/abhidg)) |
| 195 | + - [`hypothesis`](https://hypothesis.works/) |
| 196 | + - [`syrupy`](https://github.com/syrupy-project/syrupy) |
| 197 | +- Demystifying Large Language Models ([Martin O'Reilly](https://github.com/martintoreilly)) |
| 198 | + - Shortcomings with bias, hallucinations, snapshot in time. |
| 199 | + - Difficulties in evaluating performance. |
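Returning to the testing talk: the property-based style that `hypothesis` automates can be sketched with the standard library alone. The example below is a hand-rolled illustration under that assumption, not the `hypothesis` API; the run-length encoding functions and `check_round_trip` are hypothetical names chosen for the demonstration.

```python
# Hand-rolled sketch of property-based testing using only the standard library.
# This is NOT the hypothesis API; it only illustrates the underlying idea.
import random


def run_length_encode(text: str) -> list[tuple[str, int]]:
    """Compress a string into (character, repeat count) pairs."""
    pairs: list[tuple[str, int]] = []
    for char in text:
        if pairs and pairs[-1][0] == char:
            pairs[-1] = (char, pairs[-1][1] + 1)
        else:
            pairs.append((char, 1))
    return pairs


def run_length_decode(pairs: list[tuple[str, int]]) -> str:
    """Invert run_length_encode."""
    return "".join(char * count for char, count in pairs)


def check_round_trip(trials: int = 200, seed: int = 0) -> int:
    """Check the property decode(encode(x)) == x on many random inputs."""
    rng = random.Random(seed)  # seeded so that any failure is reproducible
    for _ in range(trials):
        length = rng.randint(0, 20)
        # A two-letter alphabet makes repeated runs likely.
        text = "".join(rng.choice("ab") for _ in range(length))
        assert run_length_decode(run_length_encode(text)) == text, text
    return trials


print(f"property held for {check_round_trip()} random inputs")
```

Rather than asserting on a handful of hand-picked cases, the test states a property that must hold for any input; a dedicated library adds smarter input generation and shrinking of failing cases on top of this idea.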
| 200 | + |
| 201 | +* * * |
| 202 | +### Agile Methods for RSEs |
| 203 | +#### Manchester University ([Ann Gledson](https://github.com/AnnAnnFryingPan), [Adrian Harwood](https://github.com/aharwood2)) |
| 204 | +- Manchester University's approach to managing projects using scrum and agile. |
| 205 | +- Importance of being flexible and truly agile based on engagement from stakeholders. |
| 206 | +- Tooling based around using github with customisations. |
| 207 | + |
| 208 | +#### Alan Turing Institute ([Carlos Gavidia-Calderon](https://github.com/cptanalatriste)) |
| 209 | +- Turing Institute's approach to agile - being adaptive based on projects. |
| 210 | +- Daily stand-ups are difficult with researchers, so they hold weekly ones instead (focusing on what's blocking). |
| 211 | +- Focus on what works for your team - not what other people do. |
| 212 | + |
| 213 | +* * * |
| 214 | +### Cookie cutters ([Carlos Martinez-Ortiz](https://github.com/c-martinez)) |
| 215 | +- Importance of templates for helping to adopt best practice. |
| 216 | +- Tooling beyond [Cookiecutter](https://www.cookiecutter.io/), such as [Copier](https://copier.readthedocs.io/en/stable/), which allows options and profiles. |
| 217 | +- Lots of existing templates to build on / collaborate with |
| 218 | + - [Materials Data Science and Informatics Template](https://github.com/Materials-Data-Science-and-Informatics/fair-python-cookiecutter) |
| 219 | + - [Alan Turing Institute Template](https://github.com/alan-turing-institute/python-project-template/tree/main) |
| 220 | + |
| 221 | +* * * |
| 222 | +### Other highlights |
| 223 | +- Keynote ([Anne-Marie Imafidon](https://aimafidon.com/about/)) |
| 224 | + - [Invisible Women](https://carolinecriadoperez.com/book/invisible-women/) |
| 225 | +- A New RSE Group - The First 100 Days ([Jo Walsh](https://github.com/metazool)) |
| 226 | +- Weather & Climate RSE Community ([Joe Marsh Rossney](https://github.com/jmarshrossney)) |
| 227 | + |
| 228 | +::: {layout-ncol=3} |
| 229 | + |
| 230 | + |
| 231 | + |
| 232 | + |
| 233 | + |
| 234 | +::: |
| 235 | + |
| 236 | +# Useful links |
| 237 | + |
| 238 | +- Conference website: [rsecon24.society-rse.org/](https://rsecon24.society-rse.org/) |
| 239 | +- RSE Society youtube channel: [youtube.com/@SocRSE](https://www.youtube.com/@SocRSE/) |
| 240 | +- Our internal discussions: [github.com/NERC-CEH/rse_group/discussions/40](https://github.com/NERC-CEH/rse_group/discussions/40) |