
How to use branch models? #476

Open
N1h1lv5 opened this issue Sep 14, 2023 · 7 comments

Comments


N1h1lv5 commented Sep 14, 2023

Hello everyone,

Let's say I want to use TheBloke/Llama-2-Coder-7B-GPTQ.
That model has several branches besides main, such as gptq-4bit-32g-actorder_True and gptq-8bit--1g-actorder_True, listed further down the model page.

e.g.:
https://huggingface.co/TheBloke/Llama-2-Coder-7B-GPTQ
vs
https://huggingface.co/TheBloke/Llama-2-Coder-7B-GPTQ/tree/gptq-8bit--1g-actorder_True

Every branch has its own 'model.safetensors'. How do I make localGPT use one of the branches instead of the default 'main'?

Thanks in advance, and sorry if this information is already documented somewhere; I couldn't find it.
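For reference, the branches can also be listed programmatically instead of scrolling the model page; a minimal sketch, assuming the huggingface_hub package is installed:

```python
from huggingface_hub import list_repo_refs

# List every branch of the repo (main plus the quantisation variants).
refs = list_repo_refs("TheBloke/Llama-2-Coder-7B-GPTQ")
for branch in refs.branches:
    print(branch.name)
```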

N1h1lv5 (Author) commented Sep 14, 2023

Anyone? From https://huggingface.co/TheBloke/Llama-2-13B-LoRA-Assemble-GPTQ:

```python
from transformers import AutoModelForCausalLM

model_name_or_path = "TheBloke/Llama-2-13B-LoRA-Assemble-GPTQ"

# To use a different branch, change revision
# For example: revision="main"
model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
                                             device_map="auto",
                                             trust_remote_code=False,
                                             revision="main")
```

I tried to implement revision but it doesn't work...
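For completeness, here is roughly what I am attempting, as a minimal sketch with plain transformers (the branch name is only an example, and I am assuming the tokenizer should be loaded from the same branch as the model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name_or_path = "TheBloke/Llama-2-13B-LoRA-Assemble-GPTQ"
branch = "gptq-4bit-32g-actorder_True"  # illustrative branch name, not main

# Both from_pretrained calls accept a `revision` kwarg, so the same
# branch is requested for the tokenizer and the model weights.
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, revision=branch)
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    device_map="auto",
    trust_remote_code=False,
    revision=branch,
)
```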

N1h1lv5 (Author) commented Sep 17, 2023

-removed- Now using GGUF models instead of GPTQ; sometimes getting empty answers:

Question:
what is in the text ?

Answer:

Enter a query:

I wonder whether the quantized models work differently from the rest, so that the code is not yet suited to them?

@PromtEngineer (Owner)

@N1h1lv5 I am running into the same issue, empty answers. Still debugging it. It might be related to the promptTemplate. Will debug.

N1h1lv5 (Author) commented Sep 18, 2023

> @N1h1lv5 I am running into the same issue, empty answers. Still debugging it. It might be related to the promptTemplate. Will debug.

You are the best! Thanks!

@Calamaroo

Has anyone found a workaround for this? There are a few models I'd like to try but I need to use a branch other than 'main' due to VRAM constraints. Everything I've attempted so far either downloads the main branch or fails to find the specified model.


daniellefisla commented Mar 4, 2024

> Has anyone found a workaround for this? There are a few models I'd like to try but I need to use a branch other than 'main' due to VRAM constraints. Everything I've attempted so far either downloads the main branch or fails to find the specified model.

I am wondering this as well. One way would be to use the huggingface CLI to download the model manually.
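For instance, a minimal sketch of the manual-download route using the huggingface_hub Python API instead of the CLI (the repo id, branch, and target directory are only illustrative):

```python
from huggingface_hub import snapshot_download

# Pull every file from one specific branch of the repo into a local folder,
# which can then be used instead of letting the loader fetch `main`.
snapshot_download(
    repo_id="TheBloke/Llama-2-Coder-7B-GPTQ",
    revision="gptq-4bit-32g-actorder_True",
    local_dir="models/Llama-2-Coder-7B-GPTQ-4bit-32g",
)
```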

You need to modify the code to use the revision parameter, as below.
I am going to implement this locally and will add REVISION alongside MODEL_ID and MODEL_BASENAME.

```python
from auto_gptq import AutoGPTQForCausalLM

model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
                                           revision="gptq-4bit-32g-actorder_True",
                                           model_basename=model_basename,
                                           use_safetensors=True,
                                           trust_remote_code=True,
                                           device="cuda:0",
                                           quantize_config=None)
```
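Something like the following is what I have in mind, though it is only a sketch of the idea (MODEL_REVISION is a hypothetical name, not an existing localGPT setting):

```python
# constants.py (sketch): a hypothetical MODEL_REVISION next to the
# existing MODEL_ID and MODEL_BASENAME, passed through to the model
# loader as revision=MODEL_REVISION.
MODEL_ID = "TheBloke/Llama-2-Coder-7B-GPTQ"
MODEL_BASENAME = "model.safetensors"
MODEL_REVISION = "gptq-4bit-32g-actorder_True"  # branch to load instead of main
```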


daniellefisla commented Mar 5, 2024

Created a PR to support model branches: #765 @PromtEngineer
