spec : save the dynamic/static ngram cache file#22055

Draft
petersid2022 wants to merge 1 commit into ggml-org:master from petersid2022:self-speculation-save-cache
Contributor

@petersid2022 petersid2022 commented Apr 17, 2026

Overview

  • When the COMMON_SPECULATIVE_TYPE_NGRAM_CACHE speculative implementation is selected, create_state_ngram_cache creates a new common_speculative_state_ngram_cache state, but the parameters it is instantiated with (e.g., n_draft, save_static and save_dynamic) are currently hardcoded.

  • Instead, this PR extends common_params_speculative to include those options as well.

  • It also attempts to implement the save_static / save_dynamic behavior by calling common_ngram_cache_save when the state object is destroyed.
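
As a rough illustration of the two ideas above (options carried in a params struct instead of being hardcoded, and a save triggered from the destructor), here is a minimal, self-contained sketch. It is not the code from this PR: ngram_cache_params, ngram_cache_state, toy_cache, toy_cache_save, toy_cache_load, the path_static / path_dynamic fields and the file format are all hypothetical stand-ins. Only the option names n_draft, save_static and save_dynamic come from the description.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <map>
#include <string>
#include <utility>

// Hypothetical options struct. n_draft / save_static / save_dynamic are named in the
// PR description; the path fields and default values are assumptions for this sketch.
struct ngram_cache_params {
    int         n_draft      = 8;
    bool        save_static  = false;
    bool        save_dynamic = false;
    std::string path_static;
    std::string path_dynamic;
};

// Toy cache (token -> count). The real ngram cache is keyed on n-grams.
using toy_cache = std::map<int32_t, int32_t>;

// Hypothetical stand-in for a cache save routine: one "token count" pair per line.
static bool toy_cache_save(const toy_cache & cache, const std::string & path) {
    std::ofstream out(path);
    if (!out) {
        return false;
    }
    for (const auto & kv : cache) {
        out << kv.first << ' ' << kv.second << '\n';
    }
    return out.good();
}

// Reads back the format written by toy_cache_save.
static toy_cache toy_cache_load(const std::string & path) {
    toy_cache cache;
    std::ifstream in(path);
    int32_t tok = 0;
    int32_t cnt = 0;
    while (in >> tok >> cnt) {
        cache[tok] = cnt;
    }
    return cache;
}

// Hypothetical state object: the options arrive from the caller, and the caches are
// written out when the object is destroyed.
struct ngram_cache_state {
    ngram_cache_params params;
    toy_cache          cache_static;
    toy_cache          cache_dynamic;

    explicit ngram_cache_state(ngram_cache_params p) : params(std::move(p)) {
        assert(params.n_draft > 0);
    }

    // Non-copyable, so each cache file is written by exactly one owner.
    ngram_cache_state(const ngram_cache_state &) = delete;
    ngram_cache_state & operator=(const ngram_cache_state &) = delete;

    ~ngram_cache_state() {
        // A destructor should not throw, so failures are only reported.
        if (params.save_static && !params.path_static.empty() &&
            !toy_cache_save(cache_static, params.path_static)) {
            std::fprintf(stderr, "failed to save static cache\n");
        }
        if (params.save_dynamic && !params.path_dynamic.empty() &&
            !toy_cache_save(cache_dynamic, params.path_dynamic)) {
            std::fprintf(stderr, "failed to save dynamic cache\n");
        }
    }
};

// Populates the options, lets the state go out of scope, then checks the saved file.
static bool demo_roundtrip(const std::string & path) {
    {
        ngram_cache_params p;
        p.save_dynamic = true;
        p.path_dynamic = path;
        ngram_cache_state state(p);
        state.cache_dynamic[42] = 3;
    } // the destructor runs here and writes the dynamic cache
    const toy_cache loaded = toy_cache_load(path);
    std::remove(path.c_str());
    return loaded.size() == 1 && loaded.at(42) == 3;
}
```

The sketch only does something when the caller actually fills in the options (here, save_dynamic and path_dynamic inside demo_roundtrip). With the defaults left untouched, the destructor writes nothing.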

Additional information

Add self‑speculative decoding (no draft model required) #18471

Requirements

@petersid2022 petersid2022 force-pushed the self-speculation-save-cache branch from 2e1c956 to 430c0ca Compare April 18, 2026 13:14
@petersid2022 petersid2022 marked this pull request as ready for review April 18, 2026 13:37
@petersid2022 petersid2022 requested a review from a team as a code owner April 18, 2026 13:37
@petersid2022 petersid2022 force-pushed the self-speculation-save-cache branch 2 times, most recently from d5448ea to ba99720 Compare April 20, 2026 05:49
@petersid2022 petersid2022 requested review from a team, CISC, IMbackK, ggerganov and pwilkin as code owners April 20, 2026 05:49
@petersid2022 petersid2022 force-pushed the self-speculation-save-cache branch 4 times, most recently from cf7a308 to 8ae6c04 Compare April 20, 2026 06:59
@CISC CISC removed request for a team, CISC, IMbackK and pwilkin April 20, 2026 08:03
@petersid2022 petersid2022 force-pushed the self-speculation-save-cache branch 2 times, most recently from afc3295 to dc2ab62 Compare April 20, 2026 18:29
@petersid2022 petersid2022 force-pushed the self-speculation-save-cache branch 5 times, most recently from c402b3d to 9da23a4 Compare April 21, 2026 18:34
@petersid2022 petersid2022 changed the title from "spec: save the dynamic/static ngram cache file" to "spec : save the dynamic/static ngram cache file" Apr 21, 2026
@petersid2022 petersid2022 force-pushed the self-speculation-save-cache branch 9 times, most recently from 89b10b8 to 5c5bea4 Compare April 29, 2026 11:12
Comment thread common/common.h
};

- struct common_params_speculative_ngram_cache {
+ struct common_params_speculative_ngram_cache : common_params_speculative_ngram_map {
Contributor Author

This is probably the wrong way to go about this, but I am curious whether the same concept of m-gram speculative tokens can be applied in the ngram-cache implementation.

@petersid2022 petersid2022 force-pushed the self-speculation-save-cache branch 3 times, most recently from 4d256ae to e3017a4 Compare April 30, 2026 13:03
* fix todo on providing n_draft, save_static and save_dynamic from common/common.h

* implement the functionality by saving the cache at the common_speculative_state_ngram_cache destruction
@petersid2022 petersid2022 force-pushed the self-speculation-save-cache branch from e3017a4 to 268d95e Compare May 1, 2026 09:55
@ggerganov
Member

The new parameters are never populated. Did you test this change? What is the goal of this PR?

@ggerganov ggerganov marked this pull request as draft May 1, 2026 10:16
2 participants