Add/156 dynamic cache resize #187
Open
Fixes #156
This PR addresses the issue with the resize_cache method in the ChatSampler class, ensuring that the cache tensor is properly resized when the cache length is updated. The changes include:
Addition of resize_cache Function:
A new function resize_cache is introduced to handle resizing of the cache with the help of resize_tensor. It ensures that each cache tensor is resized correctly, either by padding with zeros or by truncating, as needed.
Modification of resize_cache Method:
The resize_cache method is added alongside the resize_tensor function, which resizes the individual cache tensors.
The method now initializes a new cache with the updated size and copies the resized cache data into the new cache.
A new SamplingState is created with the updated cache and other fields copied from the existing last_state.
The last_state is updated with the new SamplingState.
The cached sampler is invalidated so that it gets recomputed with the new cache size.
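The pad-or-truncate behaviour and the state update described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: it uses NumPy as a stand-in for the project's array library, and the `axis` parameter, the dict-shaped cache, and the simplified `SamplingState` fields are all assumptions. Invalidation of the cached sampler is not shown.

```python
import dataclasses

import numpy as np


def resize_tensor(x: np.ndarray, new_length: int, axis: int = 1) -> np.ndarray:
    """Pad with zeros or truncate `x` along `axis` so it has `new_length` entries."""
    old_length = x.shape[axis]
    if new_length <= old_length:
        # Truncate: keep only the first `new_length` entries along `axis`.
        index = [slice(None)] * x.ndim
        index[axis] = slice(0, new_length)
        return x[tuple(index)]
    # Pad: append zeros at the end of `axis`, leave the other axes untouched.
    pad_width = [(0, 0)] * x.ndim
    pad_width[axis] = (0, new_length - old_length)
    return np.pad(x, pad_width)


@dataclasses.dataclass(frozen=True)
class SamplingState:
    """Hypothetical, simplified stand-in for the real SamplingState."""
    cache: dict
    step: int


def resize_cache(state: SamplingState, new_length: int) -> SamplingState:
    """Return a new state whose cache tensors all have the new length."""
    new_cache = {
        name: resize_tensor(tensor, new_length)
        for name, tensor in state.cache.items()
    }
    # Every field other than `cache` is copied over from the existing state.
    return dataclasses.replace(state, cache=new_cache)


# Usage: grow a cache of length 4 to length 6, then shrink it to length 2.
last_state = SamplingState(cache={"k": np.ones((2, 4, 3))}, step=5)
grown = resize_cache(last_state, 6)
shrunk = resize_cache(last_state, 2)
```

Building a fresh state rather than mutating the old one keeps the previous `last_state` intact until the new one is assigned, which matches the "create a new SamplingState, then update last_state" order described above.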