
[serve][doc] update serve vllm openai example for latest vllm version #50192

Merged: 9 commits into ray-project:master on Feb 6, 2025

Conversation

erictang000 (Contributor) opened this PR:

Why are these changes needed?

https://docs.ray.io/en/latest/serve/tutorials/vllm-example.html currently doesn't work out of the box with the latest vLLM versions.

Related issue number

N/A

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@erictang000 (Contributor, Author) commented:

redoing #50047 but without pinging like 30 people

@@ -53,6 +55,7 @@ def __init__(
         self.prompt_adapters = prompt_adapters
         self.request_logger = request_logger
         self.chat_template = chat_template
+        print(f"{ray.util.get_current_placement_group()=}")
Contributor:

Let's remove this debug statement as well as import ray :)
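(Context for the line above: ray.util.get_current_placement_group() returns the placement group the calling task or actor was scheduled into, or None. A minimal sketch of that kind of check, purely illustrative and not part of this PR's code:)

import ray
from ray.util import get_current_placement_group

@ray.remote
class Probe:
    def pg_id(self):
        # Returns the placement group this actor was scheduled into,
        # or None if it was not placed in one.
        pg = get_current_placement_group()
        return pg.id if pg is not None else None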

Signed-off-by: Eric Tang <[email protected]>
@GeneDer (Contributor) left a comment:

:shipit:

@GeneDer added the go label (add ONLY when ready to merge, run all tests) on Feb 3, 2025
            self.engine,
            model_config,
            served_model_names,
            self.response_role,
            [BaseModelPath(name=self.engine_args.model, model_path="./")],
Contributor:

I ran into the same thing yesterday and fixed it by setting name=self.engine_args.model, model_path=self.engine_args.model. Curious if you know what the difference is and what model_path is supposed to be used for :)

Contributor (Author):

Good point, yeah. I wasn't sure before, so I was just setting it to "./". Updated to model_path=self.engine_args.model, which seems to be what they're using in the internal vLLM tests (i.e., here). model_path doesn't seem to be used in this class for anything except here; maybe it's just a backup for when served_model_name is specified separately from model, so both are accessible.

Contributor:

I think model_path is actually where the model is being loaded from, and name is what model= in the query API gets mapped to. Does that sound right? In which case, I think you are doing the right thing by making them the same in most cases (and the user could make them different if they, e.g., want to expose a model stored locally or on S3 via a user-recognizable name) :)

Contributor (Author):

I see, yeah, makes sense!
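To make the distinction concrete, here is a minimal sketch of the two fields (the model identifiers and paths are illustrative; also note the BaseModelPath import location has moved between vLLM releases, so adjust for your version):

from vllm.entrypoints.openai.serving_models import BaseModelPath
# (older vLLM releases export BaseModelPath from
# vllm.entrypoints.openai.serving_engine instead)

# Common case, matching this PR: the name clients pass as model= and the
# path the weights are loaded from are the same identifier.
base_model = BaseModelPath(
    name="Qwen/Qwen2.5-7B-Instruct",
    model_path="Qwen/Qwen2.5-7B-Instruct",
)

# Split case: expose a checkpoint stored locally (or on S3) under a
# user-recognizable alias.
aliased_model = BaseModelPath(
    name="my-chat-model",                    # what model= in the query maps to
    model_path="/mnt/models/my-checkpoint",  # where the model is loaded from
)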

@@ -190,4 +192,4 @@ def build_app(cli_args: Dict[str, str]) -> serve.Application:
for chat in chat_completion:
    if chat.choices[0].delta.content is not None:
        print(chat.choices[0].delta.content, end="")
# __query_example_end__
Contributor:

Add a newline at the end.
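For context, the diff above is the tail of the tutorial's streaming query example. A self-contained sketch of the full query, assuming the Serve app exposes its OpenAI-compatible API at http://localhost:8000/v1 (the API key placeholder and model name are illustrative):

from openai import OpenAI

# The example endpoint doesn't authenticate requests, so any placeholder
# key works.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="fake-key")

# model= must match the served name, i.e. BaseModelPath.name
# (engine_args.model after this PR); the value here is illustrative.
chat_completion = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",
    messages=[{"role": "user", "content": "Tell me about Ray Serve."}],
    stream=True,
)

# Print tokens to stdout as they stream back.
for chat in chat_completion:
    if chat.choices[0].delta.content is not None:
        print(chat.choices[0].delta.content, end="")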

@pcmoritz (Contributor) left a comment:

Very nice, thanks for fixing this ❤️

@pcmoritz (Contributor) commented on Feb 5, 2025:

You need to fix some linting errors before this can be merged: https://buildkite.com/ray-project/microcheck/builds/10708#0194cd42-7138-4c3c-9795-04424c32c7f9

Signed-off-by: Eric Tang <[email protected]>
Signed-off-by: Eric Tang <[email protected]>
@pcmoritz enabled auto-merge (squash) on February 6, 2025 01:08
@pcmoritz merged commit f336c59 into ray-project:master on Feb 6, 2025
6 checks passed
@pcmoritz (Contributor) commented on Feb 6, 2025:

I tested it out and this is working now. Can you also add a test going forward (separate PR)? I believe @aslonnie and @comaniac have been working on getting the required dependencies in, so you should be able to set up a test now (cc @akshay-anyscale)

Labels: go (add ONLY when ready to merge, run all tests)
Projects: None yet
3 participants