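As the title says, I'd like to know how the int4 KV cache is dequantized efficiently. Is the KV cache reordered in memory beforehand, or is the data loaded with ldsm first and then shuffled within the warp?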