QQQ OOM

https://github.com/HandH1998/QQQ/blob/e307d9f00b90309069733890f28eefe9886bb6f2/QQQ/gptq/models/llama.py#L56C5-L58C36

Quantizing with the default commands places all of the data on a single GPU, which can lead to an Out-of-Memory (OOM) error. Are there ways we can optimize this process to avoid the OOM?
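
For a rough sense of scale, here is a minimal sketch of how much memory one activation buffer would need if all calibration samples are held on the GPU at once. It assumes the buffer has shape (nsamples, seqlen, hidden_size), as is common in GPTQ-style scripts. The function name and the example values (128 samples, sequence length 2048, hidden size 4096, fp16) are illustrative assumptions, not values taken from QQQ.

```python
def calibration_buffer_bytes(nsamples, seqlen, hidden_size, bytes_per_elem=2):
    """Bytes needed for one (nsamples, seqlen, hidden_size) activation buffer.

    Hypothetical helper for illustration only; bytes_per_elem=2 assumes fp16.
    """
    return nsamples * seqlen * hidden_size * bytes_per_elem


# Assumed example values, not QQQ defaults.
one_buffer = calibration_buffer_bytes(nsamples=128, seqlen=2048, hidden_size=4096)
gib = one_buffer / 1024**3

# A layer-by-layer pass that keeps both an input and an output buffer
# resident at the same time would need twice this amount.
print(f"one buffer: {gib:.1f} GiB, input + output: {2 * gib:.1f} GiB")
```

If the buffers really are this large, keeping them on the CPU and moving one sample at a time to the GPU might be one way to reduce peak usage, at the cost of extra transfer time.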