Add/156 dynamic cache resize #187

Open
wants to merge 4 commits into
base: main
@theprashasst commented Mar 16, 2025

Fixes #156
This PR fixes the resize_cache method of the ChatSampler class so that the cache tensors are actually resized when the cache length is updated. The changes include:

Addition of resize_cache Function:

A new resize_cache function is introduced to resize the cache with the help of resize_tensor. It ensures that each cache tensor ends up at the requested size, either by padding with zeros or by truncating, as needed.
Modification of resize_cache Method:

The resize_cache method is added alongside the resize_tensor function, which resizes the individual cache tensors.
The method initializes a new cache with the updated size and copies the resized cache data into it.
A new SamplingState is created with the updated cache; all other fields are copied from the existing last_state.
last_state is then replaced with the new SamplingState.
The cached sampler is invalidated so that it is recomputed with the new cache size.
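
For reviewers who want to see the steps above in one place, here is a minimal sketch. It is not the code in this PR. It uses NumPy as a stand-in for the project's arrays, and ToySampler, its fields, the single "layer_0" entry, and the (batch, cache_length, features) layout are all assumptions made up for illustration. Only the names resize_tensor, resize_cache, SamplingState and last_state come from the description.

```python
from dataclasses import dataclass, replace

import numpy as np


def resize_tensor(x: np.ndarray, new_len: int, axis: int = 1) -> np.ndarray:
    """Zero-pad or truncate `x` along `axis` so its size there is `new_len`."""
    old_len = x.shape[axis]
    if new_len <= old_len:
        # Truncate: keep the first `new_len` entries along `axis`.
        index = [slice(None)] * x.ndim
        index[axis] = slice(0, new_len)
        return x[tuple(index)]
    # Pad: append zeros after the existing entries along `axis`.
    pad_width = [(0, 0)] * x.ndim
    pad_width[axis] = (0, new_len - old_len)
    return np.pad(x, pad_width)


@dataclass(frozen=True)
class SamplingState:
    """Hypothetical stand-in; the real state has more fields."""
    step: int
    cache: dict  # layer name -> array of shape (batch, cache_length, features)


class ToySampler:
    """Hypothetical stand-in for ChatSampler, reduced to the cache logic."""

    def __init__(self, cache_length: int, batch: int = 1, features: int = 4):
        self.cache_length = cache_length
        self.last_state = SamplingState(
            step=0,
            cache={"layer_0": np.zeros((batch, cache_length, features))},
        )
        self._sampler = None  # built lazily, depends on cache_length

    @property
    def sampler(self):
        # Rebuild the (placeholder) sampler whenever it has been invalidated.
        if self._sampler is None:
            self._sampler = ("sampler", self.cache_length)
        return self._sampler

    def resize_cache(self, new_length: int) -> None:
        # 1. Allocate a new cache of the updated size and copy resized data in.
        new_cache = {}
        for name, old in self.last_state.cache.items():
            fresh = np.zeros(
                (old.shape[0], new_length, *old.shape[2:]), dtype=old.dtype
            )
            fresh[...] = resize_tensor(old, new_length)
            new_cache[name] = fresh
        # 2. Build a new state: updated cache, every other field copied over.
        self.last_state = replace(self.last_state, cache=new_cache)
        self.cache_length = new_length
        # 3. Invalidate the cached sampler so it is rebuilt for the new size.
        self._sampler = None
```

In this sketch, calling `ToySampler(cache_length=4).resize_cache(6)` leaves the first four cache positions untouched and zero-fills the two new ones, while shrinking drops the trailing positions. Whether truncating from the end is the right policy for a real key/value cache depends on how the cache is indexed, which the description does not specify.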

Successfully merging this pull request may close these issues.

Dynamic cache resize