
How to access togethercomputer/m2-bert-80M-32k-retrieval locally? #42

Closed · Huguet57 opened this issue Jan 2, 2025 · 6 comments
Huguet57 commented Jan 2, 2025

I am comparing the performance of three different approaches to generating embeddings with the togethercomputer/m2-bert-80M-32k-retrieval model. Below are the results from each method; based on them, the TogetherComputer API provides the best results, and I need guidance on how to access this model locally. A sketch of how the comparisons themselves can be computed is shown right below.
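
For reference, a minimal sketch of the comparison, assuming cosine similarity between embeddings saved as torch tensors in the .pt files listed in the results (the loop and file names are illustrative only):

import torch
import torch.nn.functional as F

# Compare a query embedding against saved reference embeddings (each .pt holds a 1-D tensor).
query = torch.load("query.pt")
for name in ["target1.pt", "target2.pt", "target3.pt",
             "random1.pt", "random2.pt", "random3.pt", "random4.pt"]:
    other = torch.load(name)
    sim = F.cosine_similarity(query.unsqueeze(0), other.unsqueeze(0)).item()
    print(f"{name:<12} | {sim:.4f} | {sim * 100:.2f}%")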

Performance Comparison

1. TogetherComputer API

Code:

from together import Together  # assumes the Together Python SDK (pip install together)

client = Together()  # reads TOGETHER_API_KEY from the environment
response = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-32k-retrieval", input=text
)

Results:

Running comparison for version: api

Self-comparison check:
----------------------------------------
query.pt     | 1.0000 | 100.00%

Comparing with target embeddings:
----------------------------------------
target1.pt   | 0.8149 | 81.49%
target2.pt   | 0.5886 | 58.86%
target3.pt   | 0.3603 | 36.03%

Comparing with random embeddings:
----------------------------------------
random1.pt   | 0.0875 | 8.75%
random2.pt   | 0.0437 | 4.37%
random3.pt   | 0.0594 | 5.94%
random4.pt   | 0.0348 | 3.48%

2. HuggingFace

Code:

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "togethercomputer/m2-bert-80M-32k-retrieval",
    trust_remote_code=True,
)

Results:

Running comparison for version: hf

Self-comparison check:
----------------------------------------
query.pt     | 1.0000 | 100.00%

Comparing with target embeddings:
----------------------------------------
target1.pt   | 0.6615 | 66.15%
target2.pt   | 0.4166 | 41.66%
target3.pt   | 0.2371 | 23.71%

Comparing with random embeddings:
----------------------------------------
random1.pt   | 0.1839 | 18.39%
random2.pt   | 0.0325 | 3.25%
random3.pt   | 0.0992 | 9.92%
random4.pt   | 0.0709 | 7.09%

3. HuggingFace v1

Code:

from transformers import BertConfig, AutoModelForMaskedLM

config = BertConfig.from_pretrained("hazyresearch/M2-BERT-32K-Retrieval-Encoder-V1")
model = AutoModelForMaskedLM.from_pretrained(
    "hazyresearch/M2-BERT-32K-Retrieval-Encoder-V1",
    config=config,
    trust_remote_code=True,
)

Results:

Running comparison for version: v1

Self-comparison check:
----------------------------------------
query.pt     | 1.0000 | 100.00%

Comparing with target embeddings:
----------------------------------------
target1.pt   | 0.4759 | 47.59%
target2.pt   | 0.5822 | 58.22%
target3.pt   | 0.4253 | 42.53%

Comparing with random embeddings:
----------------------------------------
random1.pt   | 0.0892 | 8.92%
random2.pt   | 0.2413 | 24.13%
random3.pt   | 0.2485 | 24.85%
random4.pt   | -0.0903 | -9.03%

Request for Help

The TogetherComputer API gives the best similarity scores. However, I would like to run the same model locally, both for further experimentation and to avoid depending on the API.

Question:
How can I download and use the togethercomputer/m2-bert-80M-32k-retrieval model locally, while achieving the same level of performance as with the TogetherComputer API?

Thank you!

Huguet57 changed the title from "How to Access togethercomputer/m2-bert-80M-32k-retrieval Locally?" to "How to access togethercomputer/m2-bert-80M-32k-retrieval locally?" on Jan 2, 2025
DanFu09 (Collaborator) commented Jan 2, 2025

Give this script a try: https://github.com/HazyResearch/m2/blob/main/bert/embed_text.py

It prints out the values of the embedding; they should be the same between local and the API.

The API is serving the togethercomputer model (“V0”). The V1 under hazyresearch has been fine-tuned on more diverse data, so performance may vary between those two models.
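
For example, something along these lines (the flags match the examples in EMBEDDINGS.md; treating the same command without --together-api as the local pass is an assumption):

# Hypothetical invocation: compare the API embedding with the local one for the same text.
python embed_text.py --text "hello world" --model-name togethercomputer/m2-bert-80M-32k-retrieval --together-api
python embed_text.py --text "hello world" --model-name togethercomputer/m2-bert-80M-32k-retrieval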

Huguet57 commented Jan 3, 2025

It was the padding, as @ezorita suggested in #43. Closing!
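
For anyone hitting the same mismatch, a minimal sketch of the local call with explicit max-length padding, following the pattern on the Hugging Face model card (the bert-base-uncased tokenizer and the "sentence_embedding" output key are assumptions taken from that card):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

max_seq_length = 32768
model = AutoModelForSequenceClassification.from_pretrained(
    "togethercomputer/m2-bert-80M-32k-retrieval",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", model_max_length=max_seq_length)

inputs = tokenizer(
    ["hello world"],
    return_tensors="pt",
    padding="max_length",  # padding to the full sequence length is what makes local match the API
    truncation=True,
    max_length=max_seq_length,
    return_token_type_ids=False,
)
with torch.no_grad():
    outputs = model(**inputs)
embedding = outputs["sentence_embedding"]  # assumed output key, per the model card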

Huguet57 closed this as completed on Jan 3, 2025
Huguet57 commented Jan 3, 2025

Thank you for the fast responses, @DanFu09!

> The API is serving the togethercomputer model (“V0”). The V1 under hazyresearch has been fine-tuned on more diverse data, so performance may vary between those two models.

Are v0 and v1 the same architecture but trained on different data? Is the private v2 from the GitHub docs also the same architecture trained on different data?

Thank you, very interesting model.

DanFu09 (Collaborator) commented Jan 3, 2025 via email

Huguet57 commented Jan 3, 2025

By v2 I am referring to the togethercomputer/m2-bert-80M-32k-retrieval-v2 version mentioned several times in EMBEDDINGS.md, such as in:

m2/bert/EMBEDDINGS.md, lines 69 to 73 at commit 7d359e8:

python embed_text.py --text "hello world" --model-name togethercomputer/m2-bert-80M-2k-retrieval-v2 --together-api
python embed_text.py --text "hello world" --model-name togethercomputer/m2-bert-80M-8k-retrieval-v2 --together-api
python embed_text.py --text "hello world" --model-name togethercomputer/m2-bert-80M-32k-retrieval-v2 --together-api

Maybe it is a typo!

DanFu09 (Collaborator) commented Jan 3, 2025 via email
