GLM 4.5 Air is much slower since an update. Still not fixed, so I rolled back to a previous version. Big drop in speed from 6 tk/s to 0.5 tk/s #15840
-
I am on Linux Mint 22 with an RTX 3070 Mobile, 64 GB of RAM, and a Ryzen 5800H, and I run GLM 4.5 Air through llama-server (for Silly Tavern). Since an update sometime between 5 and 14 August (I compile llama.cpp from source), generation speed has dropped to around 0.5 tokens per second, far below my usual 6 tokens per second. If I revert to an older version such as llama.cpp-9515c6131aecaccc955fdedcfe16c3e030aaefcb (last updated 5 August), I get my normal speed back. Am I the only one experiencing this? Can anyone help me figure out what is causing the problem?
Replies: 1 comment 1 reply
-
Do a git bisect to identify the exact commit causing the issue.
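If testing each step by hand is too tedious, `git bisect run` can drive the search with a script that exits 0 for a good commit, 1 for a bad one, and 125 for a commit that cannot be tested. Below is a rough sketch of such a script, not a tested recipe. It assumes a hypothetical `./measure_speed.sh` that you write yourself: it should rebuild llama.cpp at the current commit, run a short GLM 4.5 Air generation, and print the measured tokens per second as the last thing on stdout. The 3.0 threshold is just an assumed midpoint between the 0.5 and 6 tk/s reported above.

```python
#!/usr/bin/env python3
"""Helper for `git bisect run`: marks a commit good or bad by generation speed.

Usage sketch (run inside the llama.cpp checkout, starting from the slow build):
    git bisect start
    git bisect bad
    git bisect good 9515c6131aecaccc955fdedcfe16c3e030aaefcb
    git bisect run python3 bisect_speed.py
"""
import subprocess
import sys

# Exit codes that `git bisect run` interprets: 0 = good, 1 = bad, 125 = skip.
GOOD, BAD, SKIP = 0, 1, 125

# Assumption: anything below this counts as the regression (0.5 vs. 6 tk/s).
THRESHOLD_TPS = 3.0

# Hypothetical user-written script: rebuilds the current commit, runs a short
# generation, and prints tokens per second as the last token on stdout.
MEASURE_CMD = ["./measure_speed.sh"]


def parse_speed(output):
    """Return the last whitespace-separated token as a float, or None."""
    tokens = output.split()
    if not tokens:
        return None
    try:
        return float(tokens[-1])
    except ValueError:
        return None


def classify(tokens_per_second, threshold=THRESHOLD_TPS):
    """Map a measured speed to a `git bisect run` exit code."""
    if tokens_per_second is None:
        return SKIP  # build or measurement failed, so skip this commit
    return GOOD if tokens_per_second >= threshold else BAD


def measure(cmd=MEASURE_CMD):
    """Run the measurement command; return tokens per second or None."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except OSError:
        return None
    if result.returncode != 0:
        return None
    return parse_speed(result.stdout)


if __name__ == "__main__":
    sys.exit(classify(measure()))
```

When the run finishes, git prints the first bad commit, and `git bisect reset` returns you to the branch you started from.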