Info for chunking to fix #52 #58

Open: 21 commits to merge into base: main

annefou
Collaborator

@annefou annefou commented Nov 20, 2024

I started to draft the chunking section to fix #52.

@annefou
Collaborator Author

annefou commented Nov 20, 2024

@tinaok I started to write some information about chunking. It needs to be refined, which is why I opened a draft PR.

@felixcremer
Member

Feel free to incorporate parts of https://github.com/linamaes/chunking_tutorial if you find it useful.

@annefou
Collaborator Author

annefou commented Nov 20, 2024

> Feel free to incorporate parts of https://github.com/linamaes/chunking_tutorial if you find it useful.

Great! Thank you!

@annefou
Collaborator Author

annefou commented Nov 25, 2024

@clausmichele @tinaok what do you think about having an exercise on chunking?

We still need to find a way to cite the original tutorial in the references (https://github.com/linamaes/chunking_tutorial).

@annefou annefou marked this pull request as ready for review January 1, 2025 16:58
@tinaok
Collaborator

tinaok commented Jan 5, 2025

@annefou at the end of chunking_introduction_snow.ipynb, it says:

> Computations on big datasets can be very slow on a single computer, and to optimize its time we may need to parallelize your computations. This is what you will learn in the next episode with Dask.

Is this somewhere in this branch, in another branch, or not yet in this repo?

@annefou
Collaborator Author

annefou commented Jan 5, 2025

We are using Dask when creating the snow map, but here we need to teach learners how to use Dask. It needs to be added to the same notebook.

@tinaok
Collaborator

tinaok commented Jan 5, 2025

  • 2.4_exercises/chunking_introduction_snow.ipynb: do we keep the package list in the setup section? If yes, we need to update it (for example, we use hvplot, not matplotlib).

@tinaok
Collaborator

tinaok commented Jan 5, 2025

> We are using Dask when creating the snow map, but here we need to teach learners how to use Dask. It needs to be added to the same notebook.

I have the scaling_dask notebook; we can update it to fit the story.

@annefou annefou changed the title from "Draft info for chunking to fix #52" to "Info for chunking to fix #52" Jan 9, 2025
@clausmichele
Member

@tinaok @annefou my review:

  • Paths: please use cubes-and-clouds/lectures/2.4_formats_and_performance/assets/ for images and auxiliary files and cubes-and-clouds/lectures/2.4_formats_and_performance/exercises/ for the notebooks.

Lecture (markdown):

  • Questions for the quiz are missing, please add a couple of questions.
  • Where would you like the user to start your notebooks? The exercises are not mentioned in the lecture.

Notebooks:

  • chunking_introduction_snow.ipynb and scaling_dask.ipynb:
    • there is too much in them; remove what you already describe in the lecture. It is good that you want to reuse content that was already developed, but you need to reduce it, otherwise users might feel overwhelmed.
    • the layout is not aligned with the other notebooks in the course.

@tinaok
Collaborator

tinaok commented Jan 31, 2025

@clausmichele other than this part:

> ### Cost of scalability
> 
> Direct examples of computing on a workflow
> (todo: based on actual workflow)
> 
> ### Memory consumption
> 
> limitations and 
> 
> ### Difference between platform usage and cloud directly
> 
> TODO: Is this covered already in the platform lesson? Yes
> Using platforms removes complexity and adds abstraction layers

and the additional questions for the quiz, it is done.
Can you please review?
Thank you.
cc: @annefou

Development

Successfully merging this pull request may close these issues.

2.4_formats_and_performance - Adding chunking here?
4 participants