
How to access togethercomputer/m2-bert-80M-32k-retrieval locally? #42

Closed · Huguet57 opened this issue Jan 2, 2025 · 6 comments
Huguet57 commented Jan 2, 2025

I am comparing the performance of three different approaches to generating embeddings with the togethercomputer/m2-bert-80M-32k-retrieval model. Below are the results from each method; based on them, the TogetherComputer API provides the best results, and I need guidance on how to access this model locally. A sketch of how the comparisons themselves can be computed is shown right below.
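
For reference, a minimal sketch of the comparison, assuming cosine similarity between embeddings saved as torch tensors in the .pt files listed in the results (the loop and file names are illustrative only):

import torch
import torch.nn.functional as F

# Compare a query embedding against saved reference embeddings (each .pt holds a 1-D tensor).
query = torch.load("query.pt")
for name in ["target1.pt", "target2.pt", "target3.pt",
             "random1.pt", "random2.pt", "random3.pt", "random4.pt"]:
    other = torch.load(name)
    sim = F.cosine_similarity(query.unsqueeze(0), other.unsqueeze(0)).item()
    print(f"{name:<12} | {sim:.4f} | {sim * 100:.2f}%")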

Performance Comparison

1. TogetherComputer API

Code:

from together import Together  # assumes the Together Python SDK (pip install together)

client = Together()  # reads TOGETHER_API_KEY from the environment
response = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-32k-retrieval", input=text
)

Results:

Running comparison for version: api

Self-comparison check:
----------------------------------------
query.pt     | 1.0000 | 100.00%

Comparing with target embeddings:
----------------------------------------
target1.pt   | 0.8149 | 81.49%
target2.pt   | 0.5886 | 58.86%
target3.pt   | 0.3603 | 36.03%

Comparing with random embeddings:
----------------------------------------
random1.pt   | 0.0875 | 8.75%
random2.pt   | 0.0437 | 4.37%
random3.pt   | 0.0594 | 5.94%
random4.pt   | 0.0348 | 3.48%

2. HuggingFace

Code:

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "togethercomputer/m2-bert-80M-32k-retrieval",
    trust_remote_code=True,
)

Results:

Running comparison for version: hf

Self-comparison check:
----------------------------------------
query.pt     | 1.0000 | 100.00%

Comparing with target embeddings:
----------------------------------------
target1.pt   | 0.6615 | 66.15%
target2.pt   | 0.4166 | 41.66%
target3.pt   | 0.2371 | 23.71%

Comparing with random embeddings:
----------------------------------------
random1.pt   | 0.1839 | 18.39%
random2.pt   | 0.0325 | 3.25%
random3.pt   | 0.0992 | 9.92%
random4.pt   | 0.0709 | 7.09%

3. HuggingFace v1

Code:

from transformers import BertConfig, AutoModelForMaskedLM

config = BertConfig.from_pretrained("hazyresearch/M2-BERT-32K-Retrieval-Encoder-V1")
model = AutoModelForMaskedLM.from_pretrained(
    "hazyresearch/M2-BERT-32K-Retrieval-Encoder-V1",
    config=config,
    trust_remote_code=True,
)

Results:

Running comparison for version: v1

Self-comparison check:
----------------------------------------
query.pt     | 1.0000 | 100.00%

Comparing with target embeddings:
----------------------------------------
target1.pt   | 0.4759 | 47.59%
target2.pt   | 0.5822 | 58.22%
target3.pt   | 0.4253 | 42.53%

Comparing with random embeddings:
----------------------------------------
random1.pt   | 0.0892 | 8.92%
random2.pt   | 0.2413 | 24.13%
random3.pt   | 0.2485 | 24.85%
random4.pt   | -0.0903 | -9.03%

Request for Help

The TogetherComputer API gives the best similarity scores. However, I would like to run the same model locally, both for further experimentation and to avoid depending on the API.

Question:
How can I download and use the togethercomputer/m2-bert-80M-32k-retrieval model locally, while achieving the same level of performance as with the TogetherComputer API?

Thank you!

Huguet57 changed the title from "How to Access togethercomputer/m2-bert-80M-32k-retrieval Locally?" to "How to access togethercomputer/m2-bert-80M-32k-retrieval locally?" on Jan 2, 2025
DanFu09 (Collaborator) commented Jan 2, 2025

Give this script a try: https://github.com/HazyResearch/m2/blob/main/bert/embed_text.py

It prints out the values of the embedding; they should be the same between local and the API.

The API is serving the togethercomputer model (“V0”). The V1 under hazyresearch has been fine-tuned on more diverse data, so performance may vary between those two models.
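
For example, something along these lines (the flags match the examples in EMBEDDINGS.md; treating the same command without --together-api as the local pass is an assumption):

# Hypothetical invocation: compare the API embedding with the local one for the same text.
python embed_text.py --text "hello world" --model-name togethercomputer/m2-bert-80M-32k-retrieval --together-api
python embed_text.py --text "hello world" --model-name togethercomputer/m2-bert-80M-32k-retrieval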

Huguet57 commented Jan 3, 2025

It was the padding, as @ezorita suggested in #43. Closing!
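
For anyone hitting the same mismatch, a minimal sketch of the local call with explicit max-length padding, following the pattern on the Hugging Face model card (the bert-base-uncased tokenizer and the "sentence_embedding" output key are assumptions taken from that card):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

max_seq_length = 32768
model = AutoModelForSequenceClassification.from_pretrained(
    "togethercomputer/m2-bert-80M-32k-retrieval",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", model_max_length=max_seq_length)

inputs = tokenizer(
    ["hello world"],
    return_tensors="pt",
    padding="max_length",  # padding to the full sequence length is what makes local match the API
    truncation=True,
    max_length=max_seq_length,
    return_token_type_ids=False,
)
with torch.no_grad():
    outputs = model(**inputs)
embedding = outputs["sentence_embedding"]  # assumed output key, per the model card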

Huguet57 closed this as completed on Jan 3, 2025
Huguet57 commented Jan 3, 2025

Thank you for the fast responses, @DanFu09!

> The API is serving the togethercomputer model (“V0”). The V1 under hazyresearch has been fine-tuned on more diverse data, so performance may vary between those two models.

Are v0 and v1 the same architecture but trained on different data? Is the private v2 from the GitHub docs also the same architecture trained on different data?

Thank you, very interesting model.

DanFu09 (Collaborator) commented Jan 3, 2025 via email

Huguet57 commented Jan 3, 2025

By v2 I am referring to the togethercomputer/m2-bert-80M-32k-retrieval-v2 version mentioned several times in EMBEDDINGS.md, such as in:

m2/bert/EMBEDDINGS.md, lines 69 to 73 at commit 7d359e8:

python embed_text.py --text "hello world" --model-name togethercomputer/m2-bert-80M-2k-retrieval-v2 --together-api
python embed_text.py --text "hello world" --model-name togethercomputer/m2-bert-80M-8k-retrieval-v2 --together-api
python embed_text.py --text "hello world" --model-name togethercomputer/m2-bert-80M-32k-retrieval-v2 --together-api

Maybe it is a typo!

DanFu09 (Collaborator) commented Jan 3, 2025 via email
