support arbitrary metric logging from torchmetrics #677

Merged: 63 commits into main from sichu/metric-config on Feb 12, 2025

@sichu2023 sichu2023 commented Feb 5, 2025

Description

Training and validation torchmetrics.Metric instances are now organized by TorchmetricsConfig, which encapsulates metric class instantiation and naming through get_metric_name instead of relying on field_factory.

Currently model parallelism is not supported and will raise NotImplementedError.
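
For readers skimming the thread, here is a minimal sketch of the pattern described above: a config object that owns both metric construction and the logged name. It is not the code merged in this PR. `MetricConfigSketch`, `get_instance`, the `model_parallel_size` argument, and the `RunningMean` stand-in are all hypothetical names chosen for illustration; only `get_metric_name` and the `NotImplementedError` behavior come from the description, and their real signatures may differ. The stand-in replaces a real torchmetrics.Metric so the snippet runs without torchmetrics installed.

```python
from dataclasses import dataclass, field
from typing import Any


class RunningMean:
    """Stand-in for a torchmetrics.Metric (assumption: same update/compute shape)."""

    def __init__(self, scale: float = 1.0) -> None:
        self.scale = scale
        self.total = 0.0
        self.count = 0

    def update(self, value: float) -> None:
        # Accumulate state across steps, as a metric object would.
        self.total += value * self.scale
        self.count += 1

    def compute(self) -> float:
        return self.total / self.count


@dataclass
class MetricConfigSketch:
    """Hypothetical config: which metric class to build, with what kwargs, under what name."""

    metric_class: type
    metric_name: str
    kwargs: dict[str, Any] = field(default_factory=dict)

    def get_metric_name(self, stage: str) -> str:
        # Prefix with the stage so training and validation log under distinct keys.
        return f"{stage}_{self.metric_name}"

    def get_instance(self, model_parallel_size: int = 1) -> Any:
        # Mirrors the limitation stated in the description: no model parallelism.
        if model_parallel_size > 1:
            raise NotImplementedError("metric logging is not supported with model parallelism")
        return self.metric_class(**self.kwargs)


config = MetricConfigSketch(metric_class=RunningMean, metric_name="mean_loss", kwargs={"scale": 2.0})
metric = config.get_instance()
metric.update(1.0)
metric.update(3.0)
log_key = config.get_metric_name("val")
value = metric.compute()
```

Keeping the name next to the constructor arguments means a training script only has to pass one object around, and the key used for logging cannot drift from the metric it describes.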

Type of changes

  • New feature (non-breaking change which adds functionality)

@sichu2023
Collaborator Author

/build-ci

@sichu2023 self-assigned this Feb 5, 2025
@sichu2023 force-pushed the sichu/metric-config branch 2 times, most recently from 44e7217 to a79fc4e, February 5, 2025 08:49
@codecov-commenter

codecov-commenter commented Feb 5, 2025

Codecov Report

Attention: Patch coverage is 90.16393% with 6 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@9cec09f). Learn more about missing BASE report.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...-packages/bionemo-llm/src/bionemo/llm/lightning.py | 84.00% | 4 Missing ⚠️ |
| ...emo-esm2/src/bionemo/esm2/scripts/finetune_esm2.py | 85.71% | 1 Missing ⚠️ |
| ...ionemo-esm2/src/bionemo/esm2/scripts/train_esm2.py | 85.71% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #677   +/-   ##
=======================================
  Coverage        ?   86.27%           
=======================================
  Files           ?      119           
  Lines           ?     7249           
  Branches        ?        0           
=======================================
  Hits            ?     6254           
  Misses          ?      995           
  Partials        ?        0           


@farhadrgh farhadrgh left a comment

I tentatively approve to unblock, but the tests are failing, and I wasn’t able to experiment with it.

@pstjohn pstjohn left a comment

Could we get some tests for these new classes? Would be great to have unit tests around MetricConfig that show how it's used, as well as a short training run with some simple model that ensures it gets serialized correctly, produces the right results, etc.

@sichu2023 requested a review from farhadrgh February 5, 2025 22:06
@sichu2023 requested a review from dorotat-nv February 6, 2025 17:36
@sichu2023 force-pushed the sichu/metric-config branch from ecc9f67 to 4d61dc6, February 6, 2025 21:19
@sichu2023 enabled auto-merge February 6, 2025 23:10
dorotat-nv and others added 2 commits February 7, 2025 22:15
### Description
Added issue templates for bug reports and feature requests, following the
guidance in
https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository

### Type of changes
<!-- Mark the relevant option with an [x] -->

- [ ]  Bug fix (non-breaking change which fixes an issue)
- [ ]  New feature (non-breaking change which adds functionality)
- [ ]  Refactor
- [ ]  Documentation update
- [x]  Other (please describe): UI

### CI Pipeline Configuration
Configure CI behavior by applying the relevant labels:

- [SKIP_CI](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/user-guide/contributing/contributing.md#skip_ci) - Skip all continuous integration tests
- [INCLUDE_NOTEBOOKS_TESTS](https://github.com/NVIDIA/bionemo-framework/blob/main/docs/docs/user-guide/contributing/contributing.md#include_notebooks_tests) - Execute notebook validation tests in pytest

> [!NOTE]
> By default, the notebook validation tests are skipped unless explicitly enabled.

### Usage
<!--- How does a user interact with the changed code -->
```python
TODO: Add code snippet
```

### Pre-submit Checklist
<!--- Ensure all items are completed before submitting -->

 - [ ] I have tested these changes locally
 - [ ] I have updated the documentation accordingly
 - [ ] I have added/updated tests as needed
 - [ ] All existing tests pass successfully

Signed-off-by: sichu <[email protected]>
skip running our test pipeline on PRs marked as "draft" to save CI resources

Signed-off-by: Peter St. John <[email protected]>
Signed-off-by: sichu <[email protected]>
This reverts commit a180864.

Signed-off-by: sichu <[email protected]>
@sichu2023 added this pull request to the merge queue Feb 11, 2025
Merged via the queue into main with commit 486fd57 Feb 12, 2025
7 checks passed
@sichu2023 deleted the sichu/metric-config branch February 12, 2025 00:27