Actions: mlc-ai/mlc-llm

Build Docs

628 workflow runs

[Serving] Tree drafting (#2879)
Build Docs #528: Commit 20e7c4c pushed by MasterJH5574
September 7, 2024 01:13 6m 43s main
[Bench] Truncate to model max length when tokenizing (#2881)
Build Docs #527: Commit de6e916 pushed by tqchen
September 4, 2024 18:51 7m 19s main
[Engine] Preparation for switching between spec-decode mode and norma…
Build Docs #526: Commit cbaa2d0 pushed by MasterJH5574
September 2, 2024 14:06 7m 3s main
[Fix] Fix RWKV v6 weights loading for 7B/14B models (#2874)
Build Docs #525: Commit 91d861d pushed by MasterJH5574
September 2, 2024 14:05 7m 2s main
[Conv] Fix Qwen2 conv template (#2872)
Build Docs #524: Commit 55c2d9a pushed by MasterJH5574
September 1, 2024 19:48 6m 38s main
[Fix] Attribute mismatch in Phi-3-Vision Model (#2862)
Build Docs #523: Commit 52b37e8 pushed by MasterJH5574
August 28, 2024 21:13 6m 40s main
[Fix] Update seq len info after prefix cache operation (#2860)
Build Docs #522: Commit ab096ce pushed by MasterJH5574
August 28, 2024 17:53 7m 12s main
[Python] Reuse KVCache classes from TVM (#2857)
Build Docs #521: Commit 5a1bd3c pushed by MasterJH5574
August 28, 2024 16:41 6m 35s main
[Fix] Improve prefill policy for prefix cache reuse (#2859)
Build Docs #520: Commit 66730dc pushed by MasterJH5574
August 27, 2024 13:25 7m 2s main
[Model] Use attention kernel implemented by tir for CLIP vision encod…
Build Docs #519: Commit ae0853e pushed by MasterJH5574
August 26, 2024 23:36 7m 10s main
[MultiGPU] Fix pipeline parallelism prompt message (#2855)
Build Docs #518: Commit 7264faa pushed by MasterJH5574
August 26, 2024 04:24 7m 15s main
[Preset] Add Phi-3.5-mini to preset (#2845)
Build Docs #517: Commit 83d0fe3 pushed by MasterJH5574
August 24, 2024 20:00 7m 38s main
[Model] Remove hidden size assertion in LLaMA model (#2848)
Build Docs #516: Commit 9336aab pushed by MasterJH5574
August 24, 2024 19:59 6m 53s main
[Docs] Update installation instructions for ROCm (#2850)
Build Docs #515: Commit 42cf2ad pushed by MasterJH5574
August 24, 2024 19:58 7m 15s main
[ROCm] hipBLAS integration (#2830)
Build Docs #514: Commit 3869fa1 pushed by tqchen
August 23, 2024 15:48 6m 35s main
[Android] Update Andorid APK (#2842)
Build Docs #513: Commit 935be14 pushed by mengshyu
August 23, 2024 03:07 7m 18s main
[SLM] MiniCPM Multi-GPU support (#2826)
Build Docs #512: Commit 9f87508 pushed by MasterJH5574
August 22, 2024 21:04 7m 11s main
[Docs] Clarifying the number of additional model (#2841)
Build Docs #511: Commit c5dbae2 pushed by MasterJH5574
August 22, 2024 20:47 9m 4s main
[iOS] Add Phi3.5 into mlc-package-config.json (#2840)
Build Docs #510: Commit fd06a42 pushed by MasterJH5574
August 22, 2024 20:27 6m 29s main
[Model] Support Phi-3.5 (#2839)
Build Docs #509: Commit 25e3bad pushed by MasterJH5574
August 22, 2024 20:25 6m 27s main
Update the length of decode token in eagle mode (#2834)
Build Docs #508: Commit 761583e pushed by vinx13
August 22, 2024 19:54 7m 8s main
[Op] Improve the RoPE embedding implementation (#2816)
Build Docs #507: Commit 73bfbe6 pushed by tqchen
August 20, 2024 00:35 6m 34s main
[Sampler] Fix CPU sampler for speculative decoding (#2824)
Build Docs #506: Commit 6a5f63b pushed by tqchen
August 20, 2024 00:35 6m 52s main
[MultiGPU] Enable pre-sharding for pipeline parallelism (#2825)
Build Docs #505: Commit 78494f6 pushed by tqchen
August 19, 2024 23:02 7m 16s main
[MultiGPU] Fix memory estimation for pipeline parallelism (#2823)
Build Docs #504: Commit ee97167 pushed by MasterJH5574
August 19, 2024 21:03 15m 46s main