[sycl-free-inference-for-llms] integrate triton gemm/attention in pytorch for xpu #2196

Dewei-Wang-sh · 2024-09-11T09:15:08Z

No description provided.

vlad-penkin · 2024-09-11T11:16:28Z

This ticket is part of the umbrella ticket:

[sycl-free-inference-for-llms] Port and evaluate LLama3-8B and Granite-8B #2170

Dewei-Wang-sh changed the title ~~[sycl-free-inference-for-llms] integrate gemm/attention in pytorch for xpu~~ [sycl-free-inference-for-llms] integrate triton gemm/attention in pytorch for xpu Sep 11, 2024

Dewei-Wang-sh mentioned this issue Sep 11, 2024

[sycl-free-inference-for-llms] Port and evaluate LLama3-8B and Granite-8B #2170

Open

vlad-penkin added the enhancement (New feature or request) label Sep 11, 2024

vlad-penkin assigned alexbaden Sep 11, 2024

vlad-penkin added this to the 4.6 [Performance] E2E milestone Sep 11, 2024

vlad-penkin unassigned alexbaden Nov 12, 2024
