This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Don't cache reinit_modules #5543

Open · wants to merge 6 commits into main

Changes from 2 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -26,6 +26,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Removed a spurious error message "'torch.cuda' has no attribute '_check_driver'" that would appear in the logs
when a `ConfigurationError` for missing GPU was raised.
- Load model on CPU post training to save GPU memory.
- Don't cache models with `cached_transformers` when `reinit_modules` is not `None`.
Member

Better to omit this actually since this feature hasn't been released yet.

Contributor Author

Removed!

- Fixed a bug in `ShouldValidateCallback` that led to validation occurring after the first epoch regardless of the `validation_start` value.
- Fixed a bug in `ShouldValidateCallback` that led to validation occurring every `validation_interval + 1` epochs, instead of every `validation_interval` epochs.
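
The two `ShouldValidateCallback` fixes above concern when validation runs relative to `validation_start` and `validation_interval`. A minimal sketch of the intended schedule, assuming 0-indexed epochs; the helper below is illustrative only and is not the AllenNLP callback's actual code:

```python
# Illustrative only: a sketch of the validation schedule the changelog entries
# describe, NOT the actual ShouldValidateCallback implementation. The function
# name and the 0-indexed epoch counter are assumptions made for this example.
def should_validate(epoch: int, validation_start: int = 0, validation_interval: int = 1) -> bool:
    if epoch < validation_start:
        # First fix: no validation at all before validation_start.
        return False
    # Second fix: validate every validation_interval epochs thereafter,
    # not every validation_interval + 1 epochs.
    return (epoch - validation_start) % validation_interval == 0
```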

4 changes: 1 addition & 3 deletions allennlp/common/cached_transformers.py
@@ -14,7 +14,6 @@ class TransformerSpec(NamedTuple):
     model_name: str
     override_weights_file: Optional[str] = None
     override_weights_strip_prefix: Optional[str] = None
-    reinit_modules: Optional[Union[int, Tuple[int, ...], Tuple[str, ...]]] = None


 _model_cache: Dict[TransformerSpec, transformers.PreTrainedModel] = {}
@@ -66,9 +65,8 @@ def get(
         model_name,
         override_weights_file,
         override_weights_strip_prefix,
-        reinit_modules,
     )
-    transformer = _model_cache.get(spec, None)
+    transformer = None if reinit_modules is not None else _model_cache.get(spec, None)
     if transformer is None:
         if not load_weights:
             if override_weights_file is not None:
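
To make the intent of the `cached_transformers` change concrete, here is a minimal usage sketch. It assumes the public `allennlp.common.cached_transformers.get` API with a `make_copy` argument (not shown in this hunk) and that a cache hit with `make_copy=False` returns the same object; it is a sketch of the intended behavior, not a test taken from this PR.

```python
# Sketch of the intended behavior after this change. Assumptions: the
# `make_copy` parameter exists on cached_transformers.get, and a cache hit
# with make_copy=False returns the same model object.
from allennlp.common import cached_transformers

# Without reinit_modules, repeated calls share one cached model instance.
a = cached_transformers.get("bert-base-uncased", make_copy=False)
b = cached_transformers.get("bert-base-uncased", make_copy=False)
assert a is b

# With reinit_modules set, the cache lookup is skipped
# (`transformer = None if reinit_modules is not None else ...`), so every
# call reloads the model and re-initializes the requested modules instead of
# reusing a model whose layers were already re-initialized.
c = cached_transformers.get("bert-base-uncased", make_copy=False, reinit_modules=2)
d = cached_transformers.get("bert-base-uncased", make_copy=False, reinit_modules=2)
assert c is not d
```

The design choice here is to treat re-initialized models as uncacheable rather than keying the cache on `reinit_modules`, which is why the field is also dropped from `TransformerSpec`.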