
Conversation

adosar
Contributor

@adosar adosar commented Apr 15, 2025

Related to #18951 (comment).

Update the pseudocode of the validation loop according to #18951:

When the validation loop ends, and before switching to training, it restores the .training mode on all submodules to what it was before.

and add a corresponding note to {validation,test,predict}_step, since they exhibit this behavior, as the following snippet shows:

import warnings

import lightning as L
from lightning.pytorch.demos.boring_classes import BoringModel

warnings.filterwarnings('ignore')

trainer = L.Trainer(max_epochs=1)
loop = trainer.test

# Model starting in train mode: after the loop, it is back in train mode.
litmodel = BoringModel()
litmodel.train()
print('Before loop', litmodel.training)
loop(litmodel)
print('After loop', litmodel.training)

# Model starting in eval mode: after the loop, it remains in eval mode.
litmodel = BoringModel()
litmodel.eval()
print('Before loop', litmodel.training)
loop(litmodel)
print('After loop', litmodel.training)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
Before loop True
Testing DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 64/64 [00:00<00:00, 1299.38it/s]
After loop True
Before loop False
Testing DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 64/64 [00:00<00:00, 1332.01it/s]
After loop False

@awaelchli Can you please confirm that this is the intended (default) behavior of the loops?

Additional changes:

  • Fix the incorrect comment in lightning_module.rst claiming that trainer.test(model) loads the best weights. According to the docs: if ckpt_path=None and a model instance was passed, the current weights are used.

What does this PR do?

Fixes #<issue_number>

Before submitting
  • Was this discussed/agreed via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

Reviewer checklist
  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

📚 Documentation preview 📚: https://pytorch-lightning--20716.org.readthedocs.build/en/20716/

Update the pseudocode of validation loop according to Lightning-AI#18951:

> when the validation loop ends, and before switching to training, it
> restores the `.training` mode on all submodules to what it was before.

and add a corresponding note to `{validation,test,predict}_step`.

Additional changes:
* Fix incorrect comment in `lightning_module.rst` that
  `trainer.test(model)` loads the best weights.
@github-actions github-actions bot added docs Documentation related pl Generic label for PyTorch Lightning package labels Apr 15, 2025

stale bot commented Jul 19, 2025

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help see our docs: https://lightning.ai/docs/pytorch/latest/generated/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Discord. Thank you for your contributions.

@stale stale bot added the "won't fix" (This will not be worked on) label on Jul 19, 2025
@adosar
Contributor Author

adosar commented Jul 19, 2025

cc @Borda

@@ -286,6 +286,9 @@ Under the hood, Lightning does the following (pseudocode):
    # ...

    if validate_at_some_point:
        # capture .training mode of every submodule
        capture_training_mode()
Borda
Member

Can we add some examples of what capture_training_mode and the later restore_training_mode are used for? Right now it's a bit confusing, since the pseudocode declares and uses a new function without showing what it does...

adosar
Contributor Author

@Borda I am thinking of adding a note in the docs instead of introducing these functions in the pseudocode of the validation loop. The take-home message should be that Lightning takes care of ensuring that layers set to eval mode by the user remain in eval mode.
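For concreteness, here is a minimal plain-PyTorch sketch of the capture/restore behavior the pseudocode refers to. This is an illustration only, not Lightning's actual implementation; run_eval_loop and the modes dict are hypothetical names for this example:

```python
# Illustrative sketch (NOT Lightning's implementation): capture the
# .training flag of every submodule, switch to eval mode, and restore
# the flags afterwards.
import torch
from torch import nn

def run_eval_loop(model: nn.Module) -> None:
    # Capture .training mode of every submodule. named_modules() yields
    # parents before children, so the per-child restore below wins even
    # though Module.train() recurses into children.
    modes = {name: m.training for name, m in model.named_modules()}
    model.eval()
    try:
        with torch.no_grad():
            pass  # ... evaluation batches would run here ...
    finally:
        # Restore .training mode on all submodules to what it was before.
        for name, m in model.named_modules():
            m.train(modes[name])

model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(), nn.Linear(4, 1))
model[1].eval()  # the user deliberately keeps dropout in eval mode
run_eval_loop(model)
print(model.training)     # True: top-level module is back in train mode
print(model[1].training)  # False: dropout stays in eval, as the user set it
```

The per-submodule capture is what distinguishes this from a blanket `model.train()` after validation, which would silently flip user-frozen layers back into train mode.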

@stale stale bot removed the "won't fix" (This will not be worked on) label on Aug 8, 2025