Support batch processing for openai api compatible requests #659


Open
wants to merge 2 commits into main

Conversation

ravi03071991 (Contributor) commented Apr 29, 2025

PR to support batch processing with OpenAI API compatible requests.

Currently, requests are processed with batch_size=1; this PR makes it possible to compute metrics with different batch sizes.
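
A minimal illustrative sketch of the idea (the helper name, batch_size default, and model name below are assumptions for illustration, not necessarily what this PR implements): requests are grouped into chunks and each chunk is dispatched concurrently through the regular chat-completions endpoint, so any OpenAI-compatible server can serve it.

from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

client = OpenAI()  # or OpenAI(base_url=..., api_key=...) for a self-hosted server

def dispatch_in_batches(all_messages, batch_size=8, model="gpt-4o-mini"):
    # Hypothetical helper: send chat-completion requests in groups of `batch_size`
    # instead of one at a time (batch_size=1), which is the current behavior.
    results = []
    for start in range(0, len(all_messages), batch_size):
        chunk = all_messages[start:start + batch_size]
        with ThreadPoolExecutor(max_workers=len(chunk)) as pool:
            futures = [
                pool.submit(client.chat.completions.create, model=model, messages=m)
                for m in chunk
            ]
            results.extend(f.result().choices[0].message.content for f in futures)
    return results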

mickqian commented May 6, 2025

@Luodian @kcz358 Could you please take a look? Thanks!

kcz358 (Collaborator) left a comment

Hi @mickqian @ravi03071991, thank you for your contribution. Do you think it would be more appropriate to put the changes in this file

https://github.com/EvolvingLMMs-Lab/lmms-eval/blob/main/lmms_eval/models/batch_gpt4.py

instead of the open_compatible.py one? When we use that file, we may be testing a self-hosted server such as vLLM, SGLang, or any OpenAI-compatible server that does not necessarily implement the Batch API.

ravi03071991 (Contributor, Author)

Hi @mickqian @ravi03071991, thank you for your contribution. Do you think it would be more appropriate to put the changes in this file

https://github.com/EvolvingLMMs-Lab/lmms-eval/blob/main/lmms_eval/models/batch_gpt4.py

instead of the open_compatible.py one? When we use that file, we may be testing a self-hosted server such as vLLM, SGLang, or any OpenAI-compatible server that does not necessarily implement the Batch API.

Thanks @kcz358. Are you suggesting that we create a new model file called batch_openai.py?

Alternatively, we could update the existing srt_api.py model to support batch API requests, since we're specifically testing it with SGLang, and SGLang supports batch requests through the OpenAI API client.

ravi03071991 (Contributor, Author)

Also, @kcz358, the OpenAI client supports batch requests, so any self-hosted serving solution (like vLLM or SGLang) that is OpenAI-compatible should support batch requests by default.

kcz358 (Collaborator) commented May 7, 2025

I think vLLM lacks the v1/files endpoint needed to run:

batch_input_file = client.files.create(
    file=open("batchinput.jsonl", "rb"),
    purpose="batch"
)

I investigated this a while ago, and I just tried again with vLLM and found that it still does not work. I remember that this works with SGLang because they implement that endpoint.

So I believe the best way is still to change the code in lmms_eval/models/batch_gpt4.py to keep the two cases separate, since it actually points to a different endpoint.
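
For context, here is a minimal sketch of the Batch API workflow being discussed, using the OpenAI Python client (the file name, base_url, and polling interval are illustrative; a self-hosted server must expose /v1/files and /v1/batches for this to work):

import time

from openai import OpenAI

# For a self-hosted server, point base_url at it,
# e.g. OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")
client = OpenAI()

# 1. Upload the JSONL file of requests to /v1/files -- the endpoint vLLM lacks.
batch_input_file = client.files.create(
    file=open("batchinput.jsonl", "rb"),
    purpose="batch",
)

# 2. Create the batch job targeting /v1/chat/completions.
batch = client.batches.create(
    input_file_id=batch_input_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# 3. Poll until the job finishes, then download the output file (one JSON line per request).
while batch.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(10)
    batch = client.batches.retrieve(batch.id)

output_text = client.files.content(batch.output_file_id).text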

ravi03071991 (Contributor, Author)

I think vLLM lacks the v1/files endpoint needed to run:

batch_input_file = client.files.create(
    file=open("batchinput.jsonl", "rb"),
    purpose="batch"
)

I investigated this a while ago, and I just tried again with vLLM and found that it still does not work. I remember that this works with SGLang because they implement that endpoint.

So I believe the best way is still to change the code in lmms_eval/models/batch_gpt4.py to keep the two cases separate, since it actually points to a different endpoint.

Yeah, that makes sense. I just tested it and realized it’s not supported on their end. I’ll go ahead and update the code in lmms_eval/models/batch_gpt4.py accordingly. Thanks @kcz358.

ravi03071991 (Contributor, Author)

Hi @kcz358 ,

The default output format from the OpenAI Batch API seems quite different from the SGLang OpenAI client batch output.

You can check the OpenAI Batch output here.

SGLang OpenAI client batch output:

Response: {'status_code': 200, 'request_id': 'batch_f6ea6fef-e9fe-4de5-944d-92d25efef3d9-req_0', 'body': {'id': 'batch_f6ea6fef-e9fe-4de5-944d-92d25efef3d9-req_0', 'object': 'chat.completion', 'created': 1746752343, 'model': 'qwen/qwen2.5-0.5b-instruct', 'choices': {'index': 0, 'message': {'role': 'assistant', 'content': "Sure, here is a programming joke for you:\nWhy couldn't the code always stay happy when it ran?\nBecause it always had to wait for the programmer to give it a smiley face!", 'tool_calls': None, 'reasoning_content': None}, 'logprobs': None, 'finish_reason': 'stop', 'matched_stop': 151645}, 'usage': {'prompt_tokens': 35, 'completion_tokens': 40, 'total_tokens': 75}, 'system_fingerprint': None}}

To extract the result, we currently need to do something like:
item['response']['body']['choices']['message']['content']

Just wondering — would it make sense to update batch_gpt4.py to align with this format?

kcz358 (Collaborator) commented May 11, 2025

Hi @ravi03071991, do you think it is okay to do an if/else check here in batch_gpt4.py? I am not sure if this is the best option, though.
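
For illustration, such an if/else check could look roughly like this (parse_batch_line is a hypothetical helper name, not code from batch_gpt4.py):

def parse_batch_line(item: dict) -> str:
    # Extract the assistant message from one line of a batch output file.
    choices = item["response"]["body"]["choices"]
    if isinstance(choices, list):
        # OpenAI Batch API format: `choices` is a list of choice objects.
        return choices[0]["message"]["content"]
    # SGLang batch output as shown above: `choices` is a single dict.
    return choices["message"]["content"]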

ravi03071991 (Contributor, Author)

Hi @ravi03071991, do you think it is okay to do an if/else check here in batch_gpt4.py? I am not sure if this is the best option, though.

Yeah, I think so. I raised a PR on SGLang to fix it. Once it's merged, I will update this PR accordingly.

mickqian commented May 18, 2025

@ravi03071991 Hi Ravi, can you link your PR here?

ravi03071991 (Contributor, Author)

@ravi03071991 Hi Ravi, can you link your PR here?

PR - Fixes batch with single request error
PR - Fixes the batch response output format.
