Skip to content

Adding v0.4.0 vllm, sglang and dynamo test yamls #1352

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 30, 2025

Conversation

nwangfw
Copy link
Collaborator

@nwangfw nwangfw commented Jul 30, 2025

Pull Request Description

Adding test sglang, vllm, and dynamo yamls

Related Issues

Resolves: #[1336]

Important: Before submitting, please complete the description above and review the checklist below.


Contribution Guidelines (Expand for Details)

We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:

Pull Request Title Format

Your PR title should start with one of these prefixes to indicate the nature of the change:

  • [Bug]: Corrections to existing functionality
  • [CI]: Changes to build process or CI pipeline
  • [Docs]: Updates or additions to documentation
  • [API]: Modifications to aibrix's API or interface
  • [CLI]: Changes or additions to the Command Line Interface
  • [Misc]: For changes not covered above (use sparingly)

Note: For changes spanning multiple categories, use multiple prefixes in order of importance.

Submission Checklist

  • PR title includes appropriate prefix(es)
  • Changes are clearly explained in the PR description
  • New and existing tests pass successfully
  • Code adheres to project style and best practices
  • Documentation updated to reflect changes (if applicable)
  • Thorough testing completed, no regressions introduced

By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @nwangfw, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the testing infrastructure by adding new Kubernetes YAML configurations for deploying and testing disaggregated large language model (LLM) inference systems across Dynamo, SGLang, and VLLM frameworks. It also updates existing SGLang configurations to use a new disaggregation transfer backend, ensuring broader compatibility and improved performance testing.

Highlights

  • New Test Configurations: Added new YAML configurations for testing various deployment scenarios for SGLang, VLLM, and Dynamo, expanding the test suite for large language model (LLM) inference systems.
  • Dynamo Platform Deployment: Introduced comprehensive Kubernetes manifests (namespace.yaml, platform.yaml) to deploy the core 'dynamo-platform' components, including etcd, NATS, and the dynamo-operator, providing a foundational setup for Dynamo-based LLM deployments.
  • Disaggregated LLM Deployments (Dynamo): Included new DynamoGraphDeployment YAMLs (disagg.yaml, disagg_router.yaml) that define disaggregated LLM inference setups with distinct Frontend, Processor, VllmWorker, PrefillWorker, and Router services, specifying resource requirements and model download procedures.
  • SGLang Router Configurations: Added multiple StormService YAMLs for SGLang (sglang-router-1p1d-tp2.yaml, sglang-router-1p1d.yaml, sglang-router-4p3d.yaml) to test various prefill/decode replica and tensor parallelism configurations, leveraging RDMA for inter-pod communication.
  • SGLang Disaggregation Backend Update: Modified existing SGLang configurations (sglang-router-cache.yaml, sglang-router-random.yaml, aibrix-router.yaml) to switch the disaggregation-transfer-backend from nixl to mooncake, indicating an update in the underlying data transfer mechanism.
  • VLLM Disaggregated Deployment: Added a StormService YAML (1p1d.yaml) for VLLM, configuring a 1 prefill + 1 decode replica setup, utilizing the NixlConnector for KV transfer and RDMA for high-performance communication.
  • Kubernetes RBAC for SGLang: Introduced Kubernetes RBAC configurations (rbac-sglang-router.yaml) to grant necessary permissions (get, watch, list pods) to the default service account for SGLang router functionality.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds a number of YAML files for testing sglang, vllm, and dynamo. While the additions are comprehensive, I've identified several critical security vulnerabilities, such as hardcoded secrets and overly permissive RBAC roles, that must be addressed. There are also correctness issues that would cause deployment failures and maintainability concerns like using hostPath volumes and installing packages at runtime. Please see the detailed comments for suggestions on how to resolve these issues.

app.kubernetes.io/managed-by: Helm
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: ewogICAgImF1dGhzIjogewogICAgICAiYWlicml4LWNvbnRhaW5lci1yZWdpc3RyeS1jbi1iZWlqaW5nLmNyLnZvbGNlcy5jb20vYWlicml4IjogewogICAgICAgICJ1c2VybmFtZSI6ICJuaW5nLndhbmcxQGJ5dGVkYW5jZS5jb21AMzcyNjA5MDYiLAogICAgICAgICJwYXNzd29yZCI6ICJleUowZVhBaU9pSktWMVFpTENKaGJHY2lPaUpTVXpJMU5pSXNJbXRwWkNJNklsRk9WalU2U2xWRFZqcEhUa1pXT2xjMlRFYzZSVnBKU3pwYVVqWldPa0ZLVURNNlZWSXlUanBQTjAxTE9sVmFOMGc2TlZWR1V6cFpRMWxXSW4wLmV5SnBjM01pT2lKamNpMXBjM04xWlhJaUxDSnpkV0lpT2lKdWFXNW5MbmRoYm1jeFFHSjVkR1ZrWVc1alpTNWpiMjFBTXpjeU5qQTVNRFlpTENKaGRXUWlPbHNpWTNJaVhTd2laWGh3SWpveE56VXpOelV5TmpVMExDSnVZbVlpT2pFM05UTTNORGt3TlRRc0ltbGhkQ0k2TVRjMU16YzBPVEExTkN3aWFuUnBJam9pWlRkS1kyTkNSa2RPV2tSME1YUkZTU0lzSW1sa1pXNTBhWFI1WDNSNWNHVWlPaUoxYzJWeUlpd2lhV1JsYm5ScGRIbGZZV05qYjNWdWRGOXBaQ0k2TWpFd01EWTVNalF5T0N3aWFXUmxiblJwZEhsZmRYTmxjbDlwWkNJNk16Y3lOakE1TURZc0ltbHViR2x1WlY5d2IyeHBZM2tpT2lKSU5ITkpRVUZCUVVGQlFVTXZObkYxUWxGUlFVRlFMeTlSTnl0dGIzZEpRVUZCUVNKOS5RenR5eS1iSlBoZUlfWFJ5VUVCS1llXzJzNGUwdGpFMTFtRnBET0M5RVBja1hHOENZQXdXYXR4ZnRJeGFZQTVEalAxc3RZVUwxS2lKdkxDNWNDX0hqZDFTSDB6NW45bTZJR1Z0c2pSTEt1dlQtSS16Q1B2dnZhTEdzTTNKS3BUdDhRemhPcFplUlZHVnFhNXNJS2lpQjFDUVVDUDByS013VHZSUElaQldoN3V5TEdNVk00VXFlUkpsLUFYeDRHMjhmWDhwTF9JX2RMQzdlTkZRYl96M3hfNWFIZzNkbVFYcTVOQzB3dVIxR1YwRTliU1ppSDZwZ0pyVHZrU0ZMNkcwOHlRd2EyQ2M5Y1pvNGhGUWR2MTB2SnAtYWg2WkdkSi1ISFpMTlprNkplZXlzYWNEX3FVa01nQTY1WFNpSW9DbFNRalF1LU5hWjZuNmZ3Z0NPd0dHMnM4cF9HMTM3WDRIWnVILXRDX2Rnc1VKbDhhdk02VnFNenU4ZUZzb1l0RFVOaFNLU3FBZDlBZWo1T1IzRGNTMWRid19lQm5OZ3cwTnhxVFJsVEdkWDZPRTk2MFRoOW5qZUhTYVVfR2g5cTZPQWV3anptRWVGMzNqc2xMYWVxa2xVVnV2Q2xWOXl0UzVOV3Z4NzVzSmRjSU9LUVlHMlo2U29GNUNtNWtuczlOZjlYRlM0Q1lSV1ZzUGxSSVdEWDVvWFB3c1AtUXNyUmZKek93NlNSN2pueXloX2JWQi1kUWJFcnlMY1BESkcxQVpGaEx6TzR1VHhGRVpLdHkySF9JWW5TMU4yN0Q0UHdJdEVnNlROaEN0RWJLa0U2TFBNajVyZHB1bWRrOWNVMFJUWU1XQy1WWFdxcHp0aUZsR3dZXzRCNlpxTVhzR3o0ekVfb3ZjQWIwcHRyZyIsCiAgICAgICAgImF1dGgiOiAiYm1sdVp5NTNZVzVuTVVCaWVYUmxaR0Z1WTJVdVkyOXRRRE0zTWpZd09UQTJPbVY1U2pCbFdFRnBUMmxLUzFZeFVXbE1RMHBvWWtkamFVOXBTbE5WZWtreFRtbEpjMGx0ZEhCYVEwazJTV3hHVDFacVZUWlRiRlpFVm1wd1NGUnJXbGRQYkdNeVZFVmpObEpXY0VwVGVuQmhWV3BhVjA5clJrdFZSRTAyVmxaSmVWUnFjRkJPTURGTVQyeFdZVTR3WnpaT1ZsWkhWWHB3V2xFeGJGZEpiakF1WlhsS2NHTXpUV2xQYVVwcVkya3hjR016VGpGYVdFbHBURU5LZW1SWFNXbFBhVXAxWVZjMWJreHVaR2hpYldONFVVZEtOV1JIVm10WlZ6VnFXbE0xYW1JeU1VRk5lbU41VG1wQk5VMUVXV2xNUTBwb1pGZFJhVTlzYzJsWk0wbHBXRk4zYVZwWWFIZEphbTk0VG5wVmVrNTZWWGxPYWxVd1RFTktkVmx0V1dsUGFrVXpUbFJOTTA1RWEzZE9WRkZ6U1cxc2FHUkRTVFpOVkdNeFRYcGpNRTlVUVRGT1EzZHBZVzVTY0VscWIybGFWR1JMV1RKT1ExSnJaRTlYYTFJd1RWaFNSbE5UU1hOSmJXeHJXbGMxTUdGWVVqVllNMUkxWTBkVmFVOXBTakZqTWxaNVNXbDNhV0ZYVW14aWJsSndaRWhzWmxsWFRtcGlNMVoxWkVZNWNGcERTVFpOYWtWM1RVUlpOVTFxVVhsUFEzZHBZVmRTYkdKdVVuQmtTR3htWkZoT2JHTnNPWEJhUTBrMlRYcGplVTVxUVRWTlJGbHpTVzFzZFdKSGJIVmFWamwzWWpKNGNGa3phMmxQYVVwSlRraE9TbEZWUmtKUlZVWkNVVlZOZGs1dVJqRlJiRVpTVVZWR1VVeDVPVkpPZVhSMFlqTmtTbEZWUmtKUlUwbzVMbEY2ZEhsNUxXSktVR2hsU1Y5WVVubFZSVUpMV1dWZk1uTTBaVEIwYWtVeE1XMUdjRVJQUXpsRlVHTnJXRWM0UTFsQmQxZGhkSGhtZEVsNFlWbEJOVVJxVURGemRGbFZUREZMYVVwMlRFTTFZME5mU0dwa01WTklNSG8xYmpsdE5rbEhWblJ6YWxKTVMzVjJWQzFKTFhwRFVIWjJkbUZNUjNOTk0wcExjRlIwT0ZGNmFFOXdXbVZTVmtkV2NXRTFjMGxMYVdsQ01VTlJWVU5RTUhKTFRYZFVkbEpRU1ZwQ1YyZzNkWGxNUjAxV1RUUlZjV1ZTU213dFFWaDRORWN5T0daWU9IQk1YMGxmWkV4RE4yVk9SbEZpWDNvemVGODFZVWhuTTJSdFVWaHhOVTVETUhkMVVqRkhWakJGT1dKVFdtbElObkJuU25KVWRtdFRSa3cyUnpBNGVWRjNZVEpEWXpsaldtODBhRVpSWkhZeE1IWktjQzFoYURaYVIyUktMVWhJV2t4T1dtczJTbVZsZVhOaFkwUmZjVlZyVFdkQk5qVllVMmxKYjBOc1UxRnFVWFV0VG1GYU5tNDJabmRuUTA5M1IwY3ljemh3WDBjeE16ZFlORWhhZFVndGRFTmZaR2R6VlVwc09HRjJUVFpXY1UxNmRUaGxSbk52V1hSRVZVNW9VMHRUY1VGa09VRmxhalZQVWpORVkxTXhaR0ozWDJWQ2JrNW5kekJPZUhGVVVteFVSMlJZTms5Rk9UWXdWR2c1Ym1wbFNGTmhWVjlIYURseE5rOUJaWGRxZW0xRlpVWXpNMnB6YkV4aFpYRnJiRlZXZFhaRGJGWTVlWFJUTlU1WGRuZzNOWE5LWkdOSlQwdFJXVWN5V2paVGIwWTFRMjAxYTI1ek9VNW1PVmhHVXpSRFdWSlhWbk5RYkZKSlYwUllOVzlZVUhkelVDMVJjM0pTWmtwNlQzYzJVMUkzYW01NWVXaGZZbFpDTFdSUllrVnllVXhqVUVSS1J6RkJXa1pvVEhwUE5IVlVlRVpGV2t0MGVUSklYMGxaYmxNeFRqSTNSRFJRZDBsMFJXYzJWRTVvUTNSRllrdHJSVFpNVUUxcU5YSmtjSFZ0WkdzNVkxVXdVbFJaVFZkRExWWllWM0Z3ZW5ScFJteEhkMWxmTkVJMlduRk5XSE5IZWpSNlJWOXZkbU5CWWpCd2RISm4iCiAgICAgIH0KICAgIH0KICB9
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This secret contains hardcoded credentials, which is a critical security vulnerability. Secrets, including registry credentials, must never be stored in version control. Please remove this and manage credentials using a secure method, such as creating Kubernetes secrets out-of-band or using a secret management tool.

helm.sh/chart: etcd-11.1.0
type: Opaque
data:
jwt-token.pem: "LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlKS1FJQkFBS0NBZ0VBdkNCVlVhNFIwL1FpQ2owRmw3Y3h1SzNpTVI5MzkvaXRaOCtHQ3h4NEpPM1RVY2lWCmhhQ3cwUHZKMWlBUFREclVaQ3NDZ1VNSUZDczFtK3dhWHBtcTYyLytEMkVndmNkRXA2bEl2UmdIdXE0SGIzNFkKWXRaKzBlNXVYaEdxWnV5aU05dllFcUNDbGhteXBZQmhvdVEwb0NnU3pLRXlxTzNSNlhDcUk0UFhpbk5aWlhrMwpTNXAwVXNwZ2hEY0NXMVEydnpKNWI1bFVLMzFocXhEdmNBNXhEckNUTGlwSmlQMFQvNzVzT1JTeThEV3UyaTFGClZBS3Y0bXllSHpqQUJZak9NL2h2OWI3VjJJTmpQNEdtQ0ZkUVR1U29ZUlhrdE5jQ0VkbHVlWXhZeHlyMHhqdnAKNWFCT2tJWnFNdUxWUzlTRTlqVk4rYzRaNEtjYlBXTTdOOCtjcVZGbUp1ZlF3SVJoMEd3VDFIbFR6ditwM0h2bQowVDd4R0oydlg5V3JCVTJqZVJmaHFTVVFXMldSejVrcmRIVVR2ZkZjQmkxdnRiQi95UkhzeVA5ZjRaSlZMazBGCnByZW5XVFQ0dHltS2haK1crdjVVZlExbmh6ZlZPRXluYndyVEFiMHRuZXJPeUw0dEhjVk05R0Q4TjZLZGs4OFQKZ3NWTnRvdzlHNGRmU2RYQXlxdWNDRWkzRU14NVN3YmZENzVhMkpmRnkvU3cwc0p4bTZVQ3o1RnBidjFxRFRIdAovaVdVUkt0b1ZCbC9mOENuNUd6YmV6TjJTbkRScGlaUFZuR0lQbVlkQ2NwU1F5Ny9JVmFiTGRGYnVPTEZqdkpTCmRWMTh1UUFQQjA2anVBd1E2VjJJbzhoWlovVGZnV1BIQllFSG9Od3gxdmVSY3hJUW56MUJZZHN6b21VQ0F3RUEKQVFLQ0FnQnU0VlBJTDZPY1hHRVB6Y1dWRlVsK0F5YTFNMG4vWXBlNS91dVBjdmt4TXVlQlpZc1RWcEhBcXhiYwpXTXNMY0FVdEFOaWFIVXNPc0NVSndKQU9QOTRvcERxSEVBVS9OY3EyVXlWbkpsbXY0dERKR3ZYK2sxUHVJZXoxCkRzNXptWGRFdjRmTCtzVGMrVXA5RjN6MGlES29UdFk5dTZqalBrQ2hxV0R1UzgvbmIzTTZSc2FiamU2UHVXNUgKbnpDSmFjcFd5TVlnUzZ2N2NndjF5cWFGVWsyNmpjSkJ6SFRKQS9rY25EeW5MQmNIVTJ5NmRVSk9jTUwvM21zWQpGZGUvcTFGMStqRHJFODV3cllmNnIweVB6LzZxY1VUNDYvRi8wTllNZFM0Uy8vQjNlMDhINDlwSkdFSVdGdE5UCktMVDAyODUrV3c2WmRLdDE2S2c2ZDFrNDNPb0ZNc2RyUWQxN2VJK1VoVkZJaFJ0M1U2Q2lhdWF2MHlRKzZZUysKVGlSK3hJVllmOVl6MXFaU2JYVEQ1aXhlWmFqVEN2UE14am5Vc3VFUnFxSFo3Ym1sM21YU3lKWkdDUDlMd3RVcwpJUVUvcE9lVjBBWDROMURDS2V3Q1plbmhZMVBZZVloeXhscS81Mm1PYXZab0JET01vb1NxbG5UY3Y0NzlqbDFzClo5dnFHK1hUSE9mNERtdk9YOHJ3NVk2MU9xbWtmb25JaDhmRkphK0NUTDE5WEFFVjZkK0x1M2FMeUFFRVFqMXoKYjg5N0hpbzBEcDh5LzlIaGphTTJBODhyTlN5ODlBb1Y2akx2R2lVNVVTL0loaERIZkEyd2hYVHNPOU5POEJzeApvR3NXeW55ZklvVnhRWG5NbUp6dk0zOWdGQUF5bFZ6Z2pJRkhCQmdZWlRKc0pnRUJSUUtDQVFFQTBJQVhhNGhqCkdPN1dGRW9UMUZ0Z0VqQjhyemlnY1FRYWJIbXhuSjQ2RGIyNThYakpnUjZTZVVJNVhIVFZBYTYrTVNDaUNBc3EKcGV2T2FhMldaRzR5NjNSSkpGV0s5MWQ0NWVGQ1NaYVgvZXNTZVpXTXA1T09MSDRCOFdoOEo2RGFWRitpdEtHQwpqa0ViN2t5aTlSZzdUQmR5UFVQeXhnTjBNR1paYmpHOHFUcVZkSTl1dXZudUZzRFd3MDVaQk4rYm10YTJ5eHZJCjFuRUdxdlkxT0xYU0tNcHErQ21JTEZLaGlTWFhxOVc3VzQvTzRwOCt2cHVlaFVqUDFXTVd1aS85V0traVFQWXoKVVhndWI4WDVYemRJVDRBZ0hoRGttTTIva2IyanhKdEFMeWRySkx3YVJOejd1azFVUnN3WjZoUk5hb3pLU2srTgpmVHl5bUduTGJ2OEhnd0tDQVFFQTV2d0NyNWdyd01jWDdBMk5oSE1QRVNnaVdJMko4RW52dlQ2Q2NVSlB2d25YCkFFd0ZDSjhnVmFJM1RqQXpTanJQZHVuRElZMHAyNVdGRER0N2lYVDV1SnZyOHd0dVdyeW5uNGxacGRjc2ZnT2MKbUx4ZHNraHIxMVZvVkFVa1RxcHNuaVNsbkw4OGU4VTh6Um5YTk44OTB1RlJhMUp0Y2dITEltNkZ4NXJNbC9hQgpBL0d4UlZkT09vVWlDSXNpaEtlUDFKY3VqSEZEbVEvQW1wYVpqZmE2MEVlT2xwM2Z2UFJIaWt6SmpQdzJyb0V3CklYQ1RjUzd3QnY1bmpQNUJjMGZpSkVwekpqdTN3Lzd6SDlKcXB6R0JhUU5oS3owc2JGUXZtZTBKZzVLNndCZlEKcVFabzdVS29OOVNRcEZmQkd0NXJRbWhWN3p3MnJmSDlkTGJySE02aDl3S0NBUUJFZ0pFeFIxSW5zdkNySWR0NQovYnVHdmw5NHdrWVVXNkRDMndpN3lGQ1NpeTg0WXRMeGxUWFNtU3QzY1BzNjNrcW1vVzRhN25CSXF6emFBNnBSCnZOTUdBT0szQ1ZQckI5WTAwTm1TaStQTURSdTR6cVd6dVFzUkd5VlFSMS9qTlRvR1ZwV0NlejRGdHVhU1dLT0IKbTBDQU00UFpGU1lDWU1nZnpzQzdMNlFvaFpMdEJPTUFrM1RpRE9HQ3NZOFZ4dEc5QWFaMC9OL3pkSmNZUEdRbwpDYlRPcWhLVERqSXJQY3JNNEo0SFd1Vi9vN1A1U1JDNFZ6Q3g0Y1pHK1pOU0FNcGdBdEc2UGltR0x0SXk3dnlMCmlBSTkvSDZRNGtQNDRFQjhoYTMwdm4xTUFNdDIra1kzTWRPUzFpMUdSWENPTTdWcnVHNE9GQzBNWkpEWStvbFkKSW5vNUFvSUJBUUN4YW13ZHQ0SVR4bGlIaFRsbnZJQ2NLdm1HanVDZVlndjBqYWpmbVpGcktRR3RScmF2RXBwaApiQUdTMnM3cWV4V1RQcVlVaFRyYllKTWEvQzA5UFUzNkVEbEJLMHlPQVUyMjZtYWpUUXp0bFpTWWRFWjdnOGpoClVzYVprb0Y2c3NoUEI5TDAvS1pXWXY3NmdwaWNFY1NJejJ3Y1JPbXZvZFVIRTJGS0dZcUFSbVIrUlF4bU9GTHIKL0FwU0NaeHZ5OHhiKyt0VEVFUmRzbzI3cjJVdWQ0NnlIK1JUcDl1TVpRejJHMWpNSTRWVXNBSHFEWEFTaFh6egp3a0NET1YvSVRqUlNyNlNXZjFQOUJUeDArc2ZVdDkrd2Exb2RQaFY2OTRiQWNpTFduc00yZkpSVzM0ZUVoYWh6CmlaKzN1M2VsVGFXUkVsYS9OTDdEQ0pZbktjQ1pGZFd0QW9JQkFRQzcxRkdybkcxVHQ3bGZObEdYMmlmVzJZcW4KZHhUN25tamVsMlVZNk5idjYxMmcyMGgrQXlSZzJrd1A0WDFVRzljZ05LVkxHZXhnMHN2SU12c1pyR1V3anNkRQpzbXI0SjJQbzZaSmd5c2N1SVI5Vkx0TUxhdXlWQ1RPbjQzVTE1WTNsK3lxVkFPYjlGQkQxQzE4VHVVT2hKMkpXCkhqb2dXRGd2bW1RK1JpZWxuQzlTWXUwZTVWaCt4TDZOaGFSWjhqU1N2SEZrYW1MRUlrSkhROWtxY3dlTnFGUHYKN0lhRjJ0eWhuakY5ZUlpZnlSMGpFMXVNNkFFQ0tlbXBzbVJQNkxFNld6aitaZjN3RDJSaU5SYUxya1U5K1Q4VwpYSFlzejh4RUdldEJ2NmFqSzRrMmNrak0ydllaYkszNlV5c2JYMzhEMVBGeDZrOEZuRXNHT2xPTW9ycEQKLS0tLS1FTkQgUlNBIFBSSVZBVEUgS0VZLS0tLS0K"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This secret contains a hardcoded RSA private key. This is a critical security vulnerability. Private keys must never be committed to version control. Please remove this and handle key management securely.

wget https://tos-tools.tos-cn-beijing.volces.com/linux/amd64/tosutil
chmod +x tosutil
./tosutil config -i <YOUR_ACCESS_KEY_ID> -k <YOUR_SECRET_ACCESS_KEY> -e tos-cn-beijing.ivolces.com -re cn-beijing
./tosutil cp tos://aibrix-artifact-testing/models/Qwen3-8B ./models -r -p 8 -j 32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The container command includes placeholders for access keys. This is a security risk as it encourages hardcoding secrets. These credentials should be provided securely, for example, by mounting them as environment variables from a Kubernetes secret.

This issue is repeated for VllmWorker (line 109) and PrefillWorker (line 142). Consider using envFrom with a secretRef to inject these values securely.

resources:
- '*'
verbs:
- '*'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This role grants wildcard permissions (*/*/*), which is extremely permissive and violates the principle of least privilege. This poses a significant security risk. The permissions for dynamo-platform-dynamo-operator-manager-role should be scoped down to only what is absolutely necessary for the component to function.

app.kubernetes.io/managed-by: Helm
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The roleRef in this RoleBinding is set to ClusterRole, but the referenced dynamo-platform-dynamo-operator-proxy-role is defined as a Role on line 750. This will cause the binding to fail. Please change the kind to Role.

kind: Role


echo "model downloaded, start serving"

dynamo serve graphs.disagg:PrefillWorker --system-app-port 5000 --enable-system-app --use-default-health-checks --service-name PrefillWorker
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The PrefillWorker service appears to be configured to serve graphs.disagg:PrefillWorker, but this file is for the disagg_router configuration. This seems like a copy-paste error and will likely cause the service to fail or misbehave. It should probably be graphs.disagg_router:PrefillWorker.

dynamo serve graphs.disagg_router:PrefillWorker --system-app-port 5000 --enable-system-app --use-default-health-checks --service-name PrefillWorker

Comment on lines 16 to 27
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: pod-read-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: pod-read
subjects:
- kind: ServiceAccount
name: default
namespace: default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Binding a ClusterRole to the default service account in the default namespace is a security risk. Any pod in that namespace that does not specify a serviceAccountName will inherit these permissions. It's much safer to create a dedicated service account for the application that needs these permissions and bind the role to that specific service account.

- -c
- |

apt update && apt install wget -y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Installing packages at container startup is inefficient and can make the deployment unreliable if package repositories are unavailable. It's better to include necessary tools like wget in the Docker image itself. This applies to VllmWorker (line 105) and PrefillWorker (line 138) as well.

Comment on lines +102 to +107
- name: model-vol
hostPath:
path: /data01/models
type: Directory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using hostPath volumes ties the pod to a specific node that has this path, which reduces portability and can cause scheduling issues. For tests, it's better to use a solution like emptyDir with an init container to download models, or a PersistentVolumeClaim if the data needs to persist. This applies to the decode role as well.

Comment on lines +83 to +86
- name: model-vol
hostPath:
path: /data01/models
type: Directory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using hostPath volumes ties the pod to a specific node that has this path, which reduces portability and can cause scheduling issues. For tests, it's better to use a solution like emptyDir with an init container to download models, or a PersistentVolumeClaim if the data needs to persist. This applies to the decode role as well.

@@ -0,0 +1,146 @@
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename folder name from dynamo-test to dynamo

@nwangfw nwangfw force-pushed the ning/release-benchmarking branch from 759de2d to 6e8b357 Compare July 30, 2025 20:41
@nwangfw nwangfw force-pushed the ning/release-benchmarking branch from 6e8b357 to 24c0ede Compare July 30, 2025 20:45
@Jeffwan Jeffwan changed the title Adding test sglang, vllm, and dynamo yamls Adding v0.4.0 sglang, vllm, and dynamo test yamls Jul 30, 2025
@Jeffwan Jeffwan changed the title Adding v0.4.0 sglang, vllm, and dynamo test yamls Adding v0.4.0 vllm, sglang and dynamo test yamls Jul 30, 2025
@Jeffwan Jeffwan merged commit 4edf6bb into vllm-project:main Jul 30, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

v0.4.0 release performance regression test
2 participants