@be-marc (Member) commented Nov 11, 2023

No description provided.

@github-actions

Preview

lxy = list(x, y)
object_size(lx)
object_size(lxy)
```
Member Author

I like the example because it is easy to understand but it is not clear where this happens in mlr3. Maybe mention the repeated storing of our objects in benchmark results.

```


When serializing `mlr3` objects, it can happen that their size is much larger than expected.
Member Author

Maybe start with a non-technical term like storing mlr3 objects.

Because `Learner`s store the hyperparameters that were used for training in their `$state`, it is important to ensure that their size is small.
One cause for large sizes of parameter values is the presence of source references in the function's attributes.
Source references are kept when installing packages with the `--with-keep.source` option.
Note that this option is enabled by default when installing packages with `renv`.
Member Author

Mention that installing with source refs is not the normal case but I would point out renv earlier. How can I check that mlr3 was installed with source refs?
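
One possible way to answer the question about checking for source references, as a minimal sketch: a function that was kept with its source carries a `srcref` attribute, so inspecting any function from the package should be enough. The helper name `has_srcref` is made up for illustration, and the `mlr3::lrn` call in the comment assumes `mlr3` is installed.

```r
# Hypothetical helper: TRUE if a function carries source references
has_srcref = function(fn) {
  !is.null(attr(fn, "srcref"))
}

# Build a function whose source references are kept
f = eval(parse(text = "function(x) x + 1", keep.source = TRUE))
has_srcref(f)

# utils::removeSource() strips the source references again
g = utils::removeSource(f)
has_srcref(g)

# For an installed package (assumes mlr3 is available):
# has_srcref(mlr3::lrn)
```

If the last call returns `TRUE`, the package was most likely installed with source references kept.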


Another -- easily amendable -- source for large object sizes is forgetting to set the right flags.
The list below contains configuration options that can be used to reduce the size of important `mlr3` objects:

Member Author

Link to the book. This is explained in detail there.

We will update this post as new problems come to our attention.
Note that while some of these issues might seem negligible, they can cause serious problems when running large benchmark experiments, e.g. using `mlr3batchmark`.

## Avoid Installing Packages With Source References
Member Author

Maybe mention renv right in the heading.

@sebffischer (Member)

  • Use unserialize after serialize when measuring objects
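
A rough sketch of what this could look like, assuming the goal is to measure an object the way it looks after being written to disk and read back. The names `roundtrip`, `lx` and `lxx` are illustrative only, and base R's `object.size()` stands in for the `object_size()` used in the post.

```r
# Hypothetical helper: round-trip an object through serialize()/unserialize()
roundtrip = function(obj) {
  unserialize(serialize(obj, connection = NULL))
}

x = runif(1e4)
lx = list(x)
lxx = list(x, x)  # both elements refer to the same vector in memory

# Number of bytes written when serializing each object;
# the shared vector is written out once per list element
size_lx = length(serialize(lx, connection = NULL))
size_lxx = length(serialize(lxx, connection = NULL))

# Measure the object as it exists after saving and loading
object.size(roundtrip(lxx))
```

Comparing `size_lx` and `size_lxx` shows that memory shared inside an R session is not shared in the serialized form, which is why measuring after the round trip gives the more realistic number.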
