Bug in parameter count for GPTQ quantized models

#1068
by Qubitium - opened

@alozowski I think I found a bug related to GPTQ (and likely other quantized models) in the Open LLM Leaderboard. The parameter count appears to be off by a factor of ~4. The GPTQ model below is a quant of Llama-3.2-1B-Instruct, but the parameter filter will not show it until I raise the upper bound to ~6B in the UI.

Leaderboard UI: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?params=-1%2C6&search=modelcloud
Model: https://huggingface.co/ModelCloud/Llama-3.2-1B-Instruct-gptqmodel-4bit-vortex-v1

Open LLM Leaderboard org

Hi @Qubitium,

Thanks for opening the discussion here! Currently, the Leaderboard calculates the number of parameters for GPTQ models using this method (line 118). Could you suggest any improvements to this calculation? It would be appreciated!

@alozowski The current code assumes a fixed factor of 8, which means it assumes GPTQ quants are 4-bit (int4), with 8 int4 values packed into one int32. But GPTQ supports several bit widths, including 2, 3, 4, and 8, so the factor should come from the checkpoint's quantization config. That is one bug.
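For reference, something along these lines could replace the hardcoded ×8. This is only a rough sketch, not the Leaderboard's current code: `estimated_params_billions` is a hypothetical helper, and it assumes a recent `huggingface_hub` where `ModelInfo.safetensors.parameters` exposes per-dtype element counts, and that the packed GPTQ tensors are stored as int32.

```python
# Rough sketch only (not the Leaderboard's actual code): estimate the unpacked
# parameter count of a GPTQ repo by reading the real bit width from
# quantization_config instead of hardcoding a x8 factor.
import json

from huggingface_hub import HfApi, hf_hub_download


def estimated_params_billions(repo_id: str) -> float:
    """Hypothetical helper: approximate parameter count in billions."""
    info = HfApi().model_info(repo_id)

    # GPTQ checkpoints record their bit width in config.json; it can be 2, 3, 4 or 8.
    config_path = hf_hub_download(repo_id, "config.json")
    with open(config_path) as f:
        bits = json.load(f).get("quantization_config", {}).get("bits")

    # safetensors metadata lists element counts per dtype. Packed GPTQ tensors
    # (qweight/qzeros) are stored as int32, so each int32 element holds roughly
    # 32 // bits quantized weights; fp16/bf16 tensors (scales, embeddings,
    # norms) already count one element per parameter.
    per_dtype = info.safetensors.parameters  # e.g. {"I32": ..., "F16": ...}
    total = 0
    for dtype, count in per_dtype.items():
        if dtype == "I32" and bits:
            total += count * (32 // bits)  # approximate for 3-bit, exact for 2/4/8
        else:
            total += count
    return round(total / 1e9, 3)


if __name__ == "__main__":
    print(estimated_params_billions("ModelCloud/Llama-3.2-1B-Instruct-gptqmodel-4bit-vortex-v1"))
```

Note that the unpack factor is only applied to the int32 tensors here; multiplying the whole safetensors total by 8 would also inflate the fp16 scales, embeddings, and norms by 8×, which may be part of why the reported size is so far off.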

But even if the current code did, on the surface, handle the 4-bit case correctly, I still don't know why the end value is ~4x larger than reality. I may have more time on Monday to dig into this if you haven't fixed it already.
