Commit
kv_dq zero initialization to avoid NaNs from FA3 (#3632)
Summary:
X-link: facebookresearch/FBGEMM#708

Pull Request resolved: #3632

Running evals with an FP8 KV cache gives NaNs due to issues in FA3. For more context: D68708685

To reproduce:
> sh ai_codesign/gen_ai/disagg_generator_launcher/start_server_moe.sh -m 17b_text_sft -a " --ffn_quantize_mode=fp8_rowwise --attn_quantize_mode=fp8_rowwise --kv_cache_quantization=8 "

To mitigate these issues, change the initialization of the output buffers in dequantize_fp8_cache from at::empty to at::zeros.

Reviewed By: jasonjk-park

Differential Revision: D68574038

fbshipit-source-id: 3f3f5573d13f1b4046e6880363533eb1c2dfa268
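Why zero initialization can matter here: the sketch below is a standalone illustration, not the FBGEMM kernel and not a description of what FA3 actually does internally. It assumes one plausible failure mode, namely that a consumer multiplies padded slots by a zero weight instead of skipping them, so any NaN left in unwritten slots propagates (0 * NaN is NaN under IEEE 754). The names `dequantize_sketch` and `masked_sum` are hypothetical, and the `initial_fill` parameter stands in for whatever an uninitialized allocation such as at::empty happens to contain.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a dequantized KV buffer with max_len slots.
// Only the slots covered by cache_values are written; every other slot keeps
// initial_fill, which models stale memory (at::empty) or zeros (at::zeros).
std::vector<float> dequantize_sketch(const std::vector<float>& cache_values,
                                     std::size_t max_len,
                                     float initial_fill) {
    std::vector<float> out(max_len, initial_fill);
    for (std::size_t i = 0; i < cache_values.size() && i < max_len; ++i) {
        out[i] = cache_values[i];
    }
    return out;
}

// Assumed consumer behavior: padded slots are masked by a zero weight rather
// than being skipped, so their contents still enter the arithmetic.
float masked_sum(const std::vector<float>& values, std::size_t valid_len) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < values.size(); ++i) {
        const float weight = (i < valid_len) ? 1.0f : 0.0f;
        acc += weight * values[i];
    }
    return acc;
}
```

With three valid entries in an eight-slot buffer, filling the unwritten slots with `std::numeric_limits<float>::quiet_NaN()` makes `masked_sum` return NaN, while filling them with `0.0f` returns the sum of the valid entries. Zero-filling costs an extra write over the buffer, which is the trade-off this change accepts in exchange for well-defined contents.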