[serve][doc] update serve vllm openai example for latest vllm version #50192
Conversation
Redoing #50047, but without pinging like 30 people.
@@ -53,6 +55,7 @@ def __init__(
self.prompt_adapters = prompt_adapters
self.request_logger = request_logger
self.chat_template = chat_template
print(f"{ray.util.get_current_placement_group()=}")
Let's remove this debug statement as well as import ray :)
self.engine,
model_config,
served_model_names,
self.response_role,
[BaseModelPath(name=self.engine_args.model, model_path="./")],
I ran into the same thing yesterday and fixed it by setting `name=self.engine_args.model, model_path=self.engine_args.model`. Curious if you know what the difference is and what `model_path` is supposed to be used for :)
Good point, yeah. I wasn't sure before, so I was just setting it to "./". Updated to do `model_path=self.engine_args.model`, which seems to be what they're using in the internal vLLM tests (i.e. here). `model_path` doesn't seem to be used in this class for anything except here, maybe just as a backup for when `served_model_name` is specified separately from `model`, so both are accessible.
I think `model_path` is actually where the model is loaded from, and `name` is what `model=` in the query API gets mapped to. Does that sound right? If so, I think you're doing the right thing by making them the same in most cases (and the user could make them different if they e.g. want to expose a model stored locally or on S3 under a user-recognizable name) :)
I see, yeah, that makes sense!
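For readers following along, here is a minimal sketch of the `name` vs. `model_path` distinction discussed above. This is not code from the PR: the model id and paths are hypothetical, and the `BaseModelPath` import location has moved between vLLM versions, so adjust it for the version you have installed.

```python
# Sketch only: illustrates name vs. model_path as discussed in this thread.
# Import path may differ across vLLM versions.
from vllm.entrypoints.openai.serving_engine import BaseModelPath

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # hypothetical model id

# Common case (what the PR does): expose the model under the same id it is
# loaded from, so clients pass model=model_id in their OpenAI API requests.
base_model_paths = [BaseModelPath(name=model_id, model_path=model_id)]

# Alternative: load weights from a local (or S3-synced) directory but expose a
# friendlier, user-recognizable name to clients.
aliased_model_paths = [
    BaseModelPath(name="llama-3.1-8b", model_path="/mnt/models/llama-3.1-8b")
]
```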
@@ -190,4 +192,4 @@ def build_app(cli_args: Dict[str, str]) -> serve.Application:
for chat in chat_completion:
    if chat.choices[0].delta.content is not None:
        print(chat.choices[0].delta.content, end="")
# __query_example_end__
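For context, a self-contained version of the streaming query loop in the snippet above could look like the following. The base URL, API key, and model name are placeholders for whatever the Serve deployment actually exposes, not values taken from the PR.

```python
# Sketch of the streaming query pattern shown in the diff above, assuming a
# Ray Serve deployment exposing an OpenAI-compatible endpoint at
# http://localhost:8000/v1 (URL, key, and model name are placeholders).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="fake-key")

chat_completion = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # must match the served model name
    messages=[{"role": "user", "content": "Tell me a one-sentence joke."}],
    stream=True,
)

# Print tokens to stdout as they arrive from the stream.
for chat in chat_completion:
    if chat.choices[0].delta.content is not None:
        print(chat.choices[0].delta.content, end="")
```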
Add a newline at the end.
Very nice, thanks for fixing this ❤️
You need to fix some linting errors before this can be merged: https://buildkite.com/ray-project/microcheck/builds/10708#0194cd42-7138-4c3c-9795-04424c32c7f9
I tested it out and this is working now. Can you also add a test going forward (in a separate PR)? I believe @aslonnie and @comaniac have been working on getting the required dependencies in, so you should be able to set up a test now (cc @akshay-anyscale).
Why are these changes needed?
https://docs.ray.io/en/latest/serve/tutorials/vllm-example.html currently doesn't work out of the box with the latest vLLM versions.
Related issue number
N/A
Checks
- I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.