
@namanlalitnyu namanlalitnyu commented Aug 20, 2025

Changes:
This PR adds a basic workflow that runs the SGLang framework using the vLLM backend:

  1. The workflow leverages the run-nightly-performance.sh bash script, configured to use SGLang as the serving engine.
  2. It runs the nightly and genai-perf tests, which the nightly-benchmarks script runs by default.
  3. The results are currently saved as workflow artifacts; the logic to upload them to AWS S3 will be implemented in upcoming PRs.
  4. Only the H100 runner has been used so far to test this basic implementation.

We will address the following TODOs for this workflow in upcoming PRs:

  • Implement a check that fetches the latest released tags/versions of SGLang.
  • Implement the logic for uploading the benchmarking results to S3.
  • Enable the workflow for all runners (H100, A100, B200).
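
As a rough illustration of the first TODO, the latest-release check could look something like the sketch below. It is not part of this PR: the helper name `latest_release_tag` is hypothetical, and the tag format (`vX.Y.Z` with an optional `.postN` suffix) is an assumption about how SGLang names its releases. The sketch only selects the highest release from a list of tag names and does no network access.

```python
import re

# Hypothetical helper, not part of this PR.
# Assumption: release tags look like "v0.4.9" or "v0.4.9.post1".
# Anything else (e.g. "nightly", "v0.5.0rc1") is treated as a non-release.
_RELEASE_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:\.post(\d+))?$")


def latest_release_tag(tags):
    """Return the highest release tag in `tags`, or None if none match."""
    best_key, best_tag = None, None
    for tag in tags:
        match = _RELEASE_TAG.match(tag)
        if match is None:
            continue  # skip pre-releases and non-version tags
        major, minor, patch, post = match.groups()
        # A tag without ".postN" sorts below any ".postN" of the same version.
        post_num = int(post) if post is not None else -1
        key = (int(major), int(minor), int(patch), post_num)
        if best_key is None or key > best_key:
            best_key, best_tag = key, tag
    return best_tag
```

The tag names could be gathered in the workflow with something like `git ls-remote --tags` and passed to this helper. Comparing numeric tuples rather than raw strings avoids ranking `v0.4.9` above `v0.4.10`.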

Testing

  1. The workflow ran successfully, including all jobs and the SGLang server.
Screenshot 2025-08-20 at 10 55 49 PM Screenshot 2025-08-20 at 10 57 20 PM
  2. The artifacts were generated successfully.
Screenshot 2025-08-20 at 10 51 56 PM

@namanlalitnyu namanlalitnyu deployed to pytorch-x-vllm August 25, 2025 23:15 — with GitHub Actions Active