
Commit 7042d7a

sudhakarsingh27, pre-commit-ci[bot], and cyanguwa authored
TE Gemma tutorial attempt#2 (#1839)
* add tutorial files and other local changes
  Signed-off-by: Sudhakar Singh <[email protected]>
* remove extraneous code for easy debu
  Signed-off-by: Sudhakar Singh <[email protected]>
* make cuda graphs work with non-paged and paged attention
  Signed-off-by: Sudhakar Singh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* perf imp for kv cache ops
  Signed-off-by: Sudhakar Singh <[email protected]>
* add code for calibration
  Signed-off-by: Sudhakar Singh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* optimize kv_cache reindex and copy kernels
  Signed-off-by: Charlene Yang <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* changes to make quantizers work with fp8_calibration
  Signed-off-by: Sudhakar Singh <[email protected]>
* avoid reindexing from python side
  Signed-off-by: Charlene Yang <[email protected]>
* rename variable from previous commit
  Signed-off-by: Charlene Yang <[email protected]>
* minor fix
  Signed-off-by: Charlene Yang <[email protected]>
* minor fix
  Signed-off-by: Charlene Yang <[email protected]>
* use quantizer only if needed
  Signed-off-by: Sudhakar Singh <[email protected]>
* functionality of the tutorial tested and perf checked
  Signed-off-by: Sudhakar Singh <[email protected]>
* remove files and update headers/licenses
  Signed-off-by: Sudhakar Singh <[email protected]>
* update header/license
  Signed-off-by: Sudhakar Singh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* update tutorial for review
  Signed-off-by: Sudhakar Singh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* make weights downloadable on the fly; remove extra print statements
  Signed-off-by: Sudhakar Singh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix lint and update comments
  Signed-off-by: Sudhakar Singh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add comma back, typo
  Signed-off-by: Sudhakar Singh <[email protected]>
* sequence_start_positions should be None for training
  Signed-off-by: Sudhakar Singh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add paged attention numberes and update requirements.txt file
  Signed-off-by: Sudhakar Singh <[email protected]>
* more fixes
  Signed-off-by: Sudhakar Singh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* make tutorial work on blackwell
  Signed-off-by: Sudhakar Singh <[email protected]>
* remove gemma FT tutorial for now
  Signed-off-by: Sudhakar Singh <[email protected]>
* fixing the headings placement and rewording attention -> kv caching
  Signed-off-by: Sudhakar Singh <[email protected]>
* fixes from comments
  Signed-off-by: Sudhakar Singh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix the images
  Signed-off-by: Sudhakar Singh <[email protected]>
* misc fixes
  Signed-off-by: Sudhakar Singh <[email protected]>
* add more comments to te_gemma.py and cleanup utils.py
  Signed-off-by: Sudhakar Singh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add more information about the hierarchy of the classes used in the tutorial
  Signed-off-by: Sudhakar Singh <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add better cuda graphs picture
  Signed-off-by: Sudhakar Singh <[email protected]>
* addd updated cuda graphs pictures
  Signed-off-by: Sudhakar Singh <[email protected]>
* add illustrated cuda graphs
  Signed-off-by: Sudhakar Singh <[email protected]>
* fix
  Signed-off-by: Sudhakar Singh <[email protected]>
* small fixes in documentation
  Signed-off-by: Sudhakar Singh <[email protected]>
* add torch.no_grad() to force reduced memory usage
  Signed-off-by: Sudhakar Singh <[email protected]>
* some fixes from recent comments
  Signed-off-by: Sudhakar Singh <[email protected]>
* more fixes from remaining comments
  Signed-off-by: Sudhakar Singh <[email protected]>
* add te_rope_emb to class desc
  Signed-off-by: Sudhakar Singh <[email protected]>
* fix tutorial wording; add calibration fix to grouped_linear.py
  Signed-off-by: Sudhakar Singh <[email protected]>

---------

Signed-off-by: Sudhakar Singh <[email protected]>
Signed-off-by: Charlene Yang <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Charlene Yang <[email protected]>
1 parent ba37529 commit 7042d7a

23 files changed, +5152 -33 lines changed

docs/examples/te_gemma/media/calibration.svg

Lines changed: 620 additions & 0 deletions

docs/examples/te_gemma/media/calibration_1_half.svg

Lines changed: 415 additions & 0 deletions

docs/examples/te_gemma/media/calibration_2_half.svg

Lines changed: 401 additions & 0 deletions

docs/examples/te_gemma/media/fp8_model_init.svg

Lines changed: 500 additions & 0 deletions
Lines changed: 358 additions & 0 deletions
