ROCm and 8-bit quantization #1245

@DavideRossi

System Info

An AMD EPYC system with three MI210 GPUs.
Quite a complex setup. The system uses Slurm to schedule batch jobs, which usually take the form of containers launched with apptainer run. The image I'm using has ROCm 6.0.2 on Ubuntu 22.04.

Reproduction

python -m bitsandbytes

CUDA specs: CUDASpecs(highest_compute_capability=(9, 0), cuda_version_string='61', cuda_version_tuple=(6, 1))
PyTorch settings found: CUDA_VERSION=61, Highest Compute Capability: (9, 0).
WARNING: CUDA versions lower than 11 are currently not supported for LLM.int8().
You will be only to use 8-bit optimizers and quantization routines!
To manually override the PyTorch CUDA version please see: https://github.com/TimDettmers/bitsandbytes/blob/main/docs/source/nonpytorchcuda.mdx
CUDA SETUP: WARNING! CUDA runtime files not found in any environmental path.

Two issues here. First, CUDA_VERSION is not 61: that is the ROCm version (6.1). The actual CUDA version is undefined, since torch.version.cuda is None on ROCm.
As a result, the "lower than 11" warning makes little sense in this case.
Second issue: https://github.com/TimDettmers/bitsandbytes/blob/main/docs/source/nonpytorchcuda.mdx leads nowhere.
That leaves me wondering whether 8-bit on ROCm is really supported or not.
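
To illustrate the first issue, here is a minimal sketch of how a ROCm build of PyTorch can be told apart from a CUDA build. It assumes that torch.version.hip is set on ROCm builds while torch.version.cuda is None there. The helper name describe_backend is hypothetical and is not part of bitsandbytes or PyTorch, and the version strings in the comments are only illustrative.

```python
from typing import Optional, Tuple

Version = Optional[Tuple[int, int]]


def describe_backend(cuda_version: Optional[str],
                     hip_version: Optional[str]) -> Tuple[str, Version]:
    """Classify a PyTorch build from its version strings.

    Hypothetical helper: pass in torch.version.cuda and torch.version.hip.
    Only the leading "major.minor" part of each string is parsed.
    """
    if hip_version is not None:
        # ROCm builds report a HIP version string, e.g. "6.1.<build-suffix>"
        major, minor = hip_version.split(".")[:2]
        return "rocm", (int(major), int(minor))
    if cuda_version is not None:
        # CUDA builds report something like "12.1"
        major, minor = cuda_version.split(".")[:2]
        return "cuda", (int(major), int(minor))
    # Neither is set: assume a CPU-only build
    return "cpu", None


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None:
        # getattr guards against builds that lack the hip attribute
        print(describe_backend(torch.version.cuda,
                               getattr(torch.version, "hip", None)))
```

With a check like this, a (6, 1) result would be labelled as a ROCm version instead of being compared against the CUDA 11 minimum.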

OK, let's try to run some code then:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
model = AutoModelForCausalLM.from_pretrained(checkpoint, attn_implementation="eager", quantization_config=BitsAndBytesConfig(load_in_8bit=True))
outputs = model.generate(inputs)

Result:

[...]
Exception: cublasLt ran into an error!

See #538.
But now the question is: is the existing 8-bit code really unsupported on ROCm, or is this an architecture/library mismatch, in which case 8-bit could actually work?

Expected behavior

This might be a bug, or it might not. I have not been able to find specific documentation on this. It seems possible that 8-bit quantization could actually work and that the code detecting whether the architecture is supported has issues. Or it may be that I can forget about 8-bit on ROCm. Either way, I would at least like to know for sure.
