-
Notifications
You must be signed in to change notification settings - Fork 407
Adding v0.4.0 vllm, sglang and dynamo test yamls #1352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @nwangfw, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly expands the testing infrastructure by adding new Kubernetes YAML configurations for deploying and testing disaggregated large language model (LLM) inference systems across Dynamo, SGLang, and VLLM frameworks. It also updates existing SGLang configurations to use a new disaggregation transfer backend, ensuring broader compatibility and improved performance testing.
Highlights
- New Test Configurations: Added new YAML configurations for testing various deployment scenarios for SGLang, VLLM, and Dynamo, expanding the test suite for large language model (LLM) inference systems.
- Dynamo Platform Deployment: Introduced comprehensive Kubernetes manifests (
namespace.yaml
,platform.yaml
) to deploy the core 'dynamo-platform' components, including etcd, NATS, and the dynamo-operator, providing a foundational setup for Dynamo-based LLM deployments. - Disaggregated LLM Deployments (Dynamo): Included new
DynamoGraphDeployment
YAMLs (disagg.yaml
,disagg_router.yaml
) that define disaggregated LLM inference setups with distinct Frontend, Processor, VllmWorker, PrefillWorker, and Router services, specifying resource requirements and model download procedures. - SGLang Router Configurations: Added multiple
StormService
YAMLs for SGLang (sglang-router-1p1d-tp2.yaml
,sglang-router-1p1d.yaml
,sglang-router-4p3d.yaml
) to test various prefill/decode replica and tensor parallelism configurations, leveraging RDMA for inter-pod communication. - SGLang Disaggregation Backend Update: Modified existing SGLang configurations (
sglang-router-cache.yaml
,sglang-router-random.yaml
,aibrix-router.yaml
) to switch thedisaggregation-transfer-backend
fromnixl
tomooncake
, indicating an update in the underlying data transfer mechanism. - VLLM Disaggregated Deployment: Added a
StormService
YAML (1p1d.yaml
) for VLLM, configuring a 1 prefill + 1 decode replica setup, utilizing theNixlConnector
for KV transfer and RDMA for high-performance communication. - Kubernetes RBAC for SGLang: Introduced Kubernetes RBAC configurations (
rbac-sglang-router.yaml
) to grant necessary permissions (get, watch, list pods) to the default service account for SGLang router functionality.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds a number of YAML files for testing sglang
, vllm
, and dynamo
. While the additions are comprehensive, I've identified several critical security vulnerabilities, such as hardcoded secrets and overly permissive RBAC roles, that must be addressed. There are also correctness issues that would cause deployment failures and maintainability concerns like using hostPath
volumes and installing packages at runtime. Please see the detailed comments for suggestions on how to resolve these issues.
app.kubernetes.io/managed-by: Helm | ||
type: kubernetes.io/dockerconfigjson | ||
data: | ||
.dockerconfigjson: ewogICAgImF1dGhzIjogewogICAgICAiYWlicml4LWNvbnRhaW5lci1yZWdpc3RyeS1jbi1iZWlqaW5nLmNyLnZvbGNlcy5jb20vYWlicml4IjogewogICAgICAgICJ1c2VybmFtZSI6ICJuaW5nLndhbmcxQGJ5dGVkYW5jZS5jb21AMzcyNjA5MDYiLAogICAgICAgICJwYXNzd29yZCI6ICJleUowZVhBaU9pSktWMVFpTENKaGJHY2lPaUpTVXpJMU5pSXNJbXRwWkNJNklsRk9WalU2U2xWRFZqcEhUa1pXT2xjMlRFYzZSVnBKU3pwYVVqWldPa0ZLVURNNlZWSXlUanBQTjAxTE9sVmFOMGc2TlZWR1V6cFpRMWxXSW4wLmV5SnBjM01pT2lKamNpMXBjM04xWlhJaUxDSnpkV0lpT2lKdWFXNW5MbmRoYm1jeFFHSjVkR1ZrWVc1alpTNWpiMjFBTXpjeU5qQTVNRFlpTENKaGRXUWlPbHNpWTNJaVhTd2laWGh3SWpveE56VXpOelV5TmpVMExDSnVZbVlpT2pFM05UTTNORGt3TlRRc0ltbGhkQ0k2TVRjMU16YzBPVEExTkN3aWFuUnBJam9pWlRkS1kyTkNSa2RPV2tSME1YUkZTU0lzSW1sa1pXNTBhWFI1WDNSNWNHVWlPaUoxYzJWeUlpd2lhV1JsYm5ScGRIbGZZV05qYjNWdWRGOXBaQ0k2TWpFd01EWTVNalF5T0N3aWFXUmxiblJwZEhsZmRYTmxjbDlwWkNJNk16Y3lOakE1TURZc0ltbHViR2x1WlY5d2IyeHBZM2tpT2lKSU5ITkpRVUZCUVVGQlFVTXZObkYxUWxGUlFVRlFMeTlSTnl0dGIzZEpRVUZCUVNKOS5RenR5eS1iSlBoZUlfWFJ5VUVCS1llXzJzNGUwdGpFMTFtRnBET0M5RVBja1hHOENZQXdXYXR4ZnRJeGFZQTVEalAxc3RZVUwxS2lKdkxDNWNDX0hqZDFTSDB6NW45bTZJR1Z0c2pSTEt1dlQtSS16Q1B2dnZhTEdzTTNKS3BUdDhRemhPcFplUlZHVnFhNXNJS2lpQjFDUVVDUDByS013VHZSUElaQldoN3V5TEdNVk00VXFlUkpsLUFYeDRHMjhmWDhwTF9JX2RMQzdlTkZRYl96M3hfNWFIZzNkbVFYcTVOQzB3dVIxR1YwRTliU1ppSDZwZ0pyVHZrU0ZMNkcwOHlRd2EyQ2M5Y1pvNGhGUWR2MTB2SnAtYWg2WkdkSi1ISFpMTlprNkplZXlzYWNEX3FVa01nQTY1WFNpSW9DbFNRalF1LU5hWjZuNmZ3Z0NPd0dHMnM4cF9HMTM3WDRIWnVILXRDX2Rnc1VKbDhhdk02VnFNenU4ZUZzb1l0RFVOaFNLU3FBZDlBZWo1T1IzRGNTMWRid19lQm5OZ3cwTnhxVFJsVEdkWDZPRTk2MFRoOW5qZUhTYVVfR2g5cTZPQWV3anptRWVGMzNqc2xMYWVxa2xVVnV2Q2xWOXl0UzVOV3Z4NzVzSmRjSU9LUVlHMlo2U29GNUNtNWtuczlOZjlYRlM0Q1lSV1ZzUGxSSVdEWDVvWFB3c1AtUXNyUmZKek93NlNSN2pueXloX2JWQi1kUWJFcnlMY1BESkcxQVpGaEx6TzR1VHhGRVpLdHkySF9JWW5TMU4yN0Q0UHdJdEVnNlROaEN0RWJLa0U2TFBNajVyZHB1bWRrOWNVMFJUWU1XQy1WWFdxcHp0aUZsR3dZXzRCNlpxTVhzR3o0ekVfb3ZjQWIwcHRyZyIsCiAgICAgICAgImF1dGgiOiAiYm1sdVp5NTNZVzVuTVVCaWVYUmxaR0Z1WTJVdVkyOXRRRE0zTWpZd09UQTJPbVY1U2pCbFdFRnBUMmxLUzFZeFVXbE1RMHBvWWtkamFVOXBTbE5WZWtreFRtbEpjMGx0ZEhCYVEwazJTV3hHVDFacVZUWlRiRlpFVm1wd1NGUnJXbGRQYkdNeVZFVmpObEpXY0VwVGVuQmhWV3BhVjA5clJrdFZSRTAyVmxaSmVWUnFjRkJPTURGTVQyeFdZVTR3WnpaT1ZsWkhWWHB3V2xFeGJGZEpiakF1WlhsS2NHTXpUV2xQYVVwcVkya3hjR016VGpGYVdFbHBURU5LZW1SWFNXbFBhVXAxWVZjMWJreHVaR2hpYldONFVVZEtOV1JIVm10WlZ6VnFXbE0xYW1JeU1VRk5lbU41VG1wQk5VMUVXV2xNUTBwb1pGZFJhVTlzYzJsWk0wbHBXRk4zYVZwWWFIZEphbTk0VG5wVmVrNTZWWGxPYWxVd1RFTktkVmx0V1dsUGFrVXpUbFJOTTA1RWEzZE9WRkZ6U1cxc2FHUkRTVFpOVkdNeFRYcGpNRTlVUVRGT1EzZHBZVzVTY0VscWIybGFWR1JMV1RKT1ExSnJaRTlYYTFJd1RWaFNSbE5UU1hOSmJXeHJXbGMxTUdGWVVqVllNMUkxWTBkVmFVOXBTakZqTWxaNVNXbDNhV0ZYVW14aWJsSndaRWhzWmxsWFRtcGlNMVoxWkVZNWNGcERTVFpOYWtWM1RVUlpOVTFxVVhsUFEzZHBZVmRTYkdKdVVuQmtTR3htWkZoT2JHTnNPWEJhUTBrMlRYcGplVTVxUVRWTlJGbHpTVzFzZFdKSGJIVmFWamwzWWpKNGNGa3phMmxQYVVwSlRraE9TbEZWUmtKUlZVWkNVVlZOZGs1dVJqRlJiRVpTVVZWR1VVeDVPVkpPZVhSMFlqTmtTbEZWUmtKUlUwbzVMbEY2ZEhsNUxXSktVR2hsU1Y5WVVubFZSVUpMV1dWZk1uTTBaVEIwYWtVeE1XMUdjRVJQUXpsRlVHTnJXRWM0UTFsQmQxZGhkSGhtZEVsNFlWbEJOVVJxVURGemRGbFZUREZMYVVwMlRFTTFZME5mU0dwa01WTklNSG8xYmpsdE5rbEhWblJ6YWxKTVMzVjJWQzFKTFhwRFVIWjJkbUZNUjNOTk0wcExjRlIwT0ZGNmFFOXdXbVZTVmtkV2NXRTFjMGxMYVdsQ01VTlJWVU5RTUhKTFRYZFVkbEpRU1ZwQ1YyZzNkWGxNUjAxV1RUUlZjV1ZTU213dFFWaDRORWN5T0daWU9IQk1YMGxmWkV4RE4yVk9SbEZpWDNvemVGODFZVWhuTTJSdFVWaHhOVTVETUhkMVVqRkhWakJGT1dKVFdtbElObkJuU25KVWRtdFRSa3cyUnpBNGVWRjNZVEpEWXpsaldtODBhRVpSWkhZeE1IWktjQzFoYURaYVIyUktMVWhJV2t4T1dtczJTbVZsZVhOaFkwUmZjVlZyVFdkQk5qVllVMmxKYjBOc1UxRnFVWFV0VG1GYU5tNDJabmRuUTA5M1IwY3ljemh3WDBjeE16ZFlORWhhZFVndGRFTmZaR2R6VlVwc09HRjJUVFpXY1UxNmRUaGxSbk52V1hSRVZVNW9VMHRUY1VGa09VRmxhalZQVWpORVkxTXhaR0ozWDJWQ2JrNW5kekJPZUhGVVVteFVSMlJZTms5Rk9UWXdWR2c1Ym1wbFNGTmhWVjlIYURseE5rOUJaWGRxZW0xRlpVWXpNMnB6YkV4aFpYRnJiRlZXZFhaRGJGWTVlWFJUTlU1WGRuZzNOWE5LWkdOSlQwdFJXVWN5V2paVGIwWTFRMjAxYTI1ek9VNW1PVmhHVXpSRFdWSlhWbk5RYkZKSlYwUllOVzlZVUhkelVDMVJjM0pTWmtwNlQzYzJVMUkzYW01NWVXaGZZbFpDTFdSUllrVnllVXhqVUVSS1J6RkJXa1pvVEhwUE5IVlVlRVpGV2t0MGVUSklYMGxaYmxNeFRqSTNSRFJRZDBsMFJXYzJWRTVvUTNSRllrdHJSVFpNVUUxcU5YSmtjSFZ0WkdzNVkxVXdVbFJaVFZkRExWWllWM0Z3ZW5ScFJteEhkMWxmTkVJMlduRk5XSE5IZWpSNlJWOXZkbU5CWWpCd2RISm4iCiAgICAgIH0KICAgIH0KICB9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This secret contains hardcoded credentials, which is a critical security vulnerability. Secrets, including registry credentials, must never be stored in version control. Please remove this and manage credentials using a secure method, such as creating Kubernetes secrets out-of-band or using a secret management tool.
helm.sh/chart: etcd-11.1.0 | ||
type: Opaque | ||
data: | ||
jwt-token.pem: "LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlKS1FJQkFBS0NBZ0VBdkNCVlVhNFIwL1FpQ2owRmw3Y3h1SzNpTVI5MzkvaXRaOCtHQ3h4NEpPM1RVY2lWCmhhQ3cwUHZKMWlBUFREclVaQ3NDZ1VNSUZDczFtK3dhWHBtcTYyLytEMkVndmNkRXA2bEl2UmdIdXE0SGIzNFkKWXRaKzBlNXVYaEdxWnV5aU05dllFcUNDbGhteXBZQmhvdVEwb0NnU3pLRXlxTzNSNlhDcUk0UFhpbk5aWlhrMwpTNXAwVXNwZ2hEY0NXMVEydnpKNWI1bFVLMzFocXhEdmNBNXhEckNUTGlwSmlQMFQvNzVzT1JTeThEV3UyaTFGClZBS3Y0bXllSHpqQUJZak9NL2h2OWI3VjJJTmpQNEdtQ0ZkUVR1U29ZUlhrdE5jQ0VkbHVlWXhZeHlyMHhqdnAKNWFCT2tJWnFNdUxWUzlTRTlqVk4rYzRaNEtjYlBXTTdOOCtjcVZGbUp1ZlF3SVJoMEd3VDFIbFR6ditwM0h2bQowVDd4R0oydlg5V3JCVTJqZVJmaHFTVVFXMldSejVrcmRIVVR2ZkZjQmkxdnRiQi95UkhzeVA5ZjRaSlZMazBGCnByZW5XVFQ0dHltS2haK1crdjVVZlExbmh6ZlZPRXluYndyVEFiMHRuZXJPeUw0dEhjVk05R0Q4TjZLZGs4OFQKZ3NWTnRvdzlHNGRmU2RYQXlxdWNDRWkzRU14NVN3YmZENzVhMkpmRnkvU3cwc0p4bTZVQ3o1RnBidjFxRFRIdAovaVdVUkt0b1ZCbC9mOENuNUd6YmV6TjJTbkRScGlaUFZuR0lQbVlkQ2NwU1F5Ny9JVmFiTGRGYnVPTEZqdkpTCmRWMTh1UUFQQjA2anVBd1E2VjJJbzhoWlovVGZnV1BIQllFSG9Od3gxdmVSY3hJUW56MUJZZHN6b21VQ0F3RUEKQVFLQ0FnQnU0VlBJTDZPY1hHRVB6Y1dWRlVsK0F5YTFNMG4vWXBlNS91dVBjdmt4TXVlQlpZc1RWcEhBcXhiYwpXTXNMY0FVdEFOaWFIVXNPc0NVSndKQU9QOTRvcERxSEVBVS9OY3EyVXlWbkpsbXY0dERKR3ZYK2sxUHVJZXoxCkRzNXptWGRFdjRmTCtzVGMrVXA5RjN6MGlES29UdFk5dTZqalBrQ2hxV0R1UzgvbmIzTTZSc2FiamU2UHVXNUgKbnpDSmFjcFd5TVlnUzZ2N2NndjF5cWFGVWsyNmpjSkJ6SFRKQS9rY25EeW5MQmNIVTJ5NmRVSk9jTUwvM21zWQpGZGUvcTFGMStqRHJFODV3cllmNnIweVB6LzZxY1VUNDYvRi8wTllNZFM0Uy8vQjNlMDhINDlwSkdFSVdGdE5UCktMVDAyODUrV3c2WmRLdDE2S2c2ZDFrNDNPb0ZNc2RyUWQxN2VJK1VoVkZJaFJ0M1U2Q2lhdWF2MHlRKzZZUysKVGlSK3hJVllmOVl6MXFaU2JYVEQ1aXhlWmFqVEN2UE14am5Vc3VFUnFxSFo3Ym1sM21YU3lKWkdDUDlMd3RVcwpJUVUvcE9lVjBBWDROMURDS2V3Q1plbmhZMVBZZVloeXhscS81Mm1PYXZab0JET01vb1NxbG5UY3Y0NzlqbDFzClo5dnFHK1hUSE9mNERtdk9YOHJ3NVk2MU9xbWtmb25JaDhmRkphK0NUTDE5WEFFVjZkK0x1M2FMeUFFRVFqMXoKYjg5N0hpbzBEcDh5LzlIaGphTTJBODhyTlN5ODlBb1Y2akx2R2lVNVVTL0loaERIZkEyd2hYVHNPOU5POEJzeApvR3NXeW55ZklvVnhRWG5NbUp6dk0zOWdGQUF5bFZ6Z2pJRkhCQmdZWlRKc0pnRUJSUUtDQVFFQTBJQVhhNGhqCkdPN1dGRW9UMUZ0Z0VqQjhyemlnY1FRYWJIbXhuSjQ2RGIyNThYakpnUjZTZVVJNVhIVFZBYTYrTVNDaUNBc3EKcGV2T2FhMldaRzR5NjNSSkpGV0s5MWQ0NWVGQ1NaYVgvZXNTZVpXTXA1T09MSDRCOFdoOEo2RGFWRitpdEtHQwpqa0ViN2t5aTlSZzdUQmR5UFVQeXhnTjBNR1paYmpHOHFUcVZkSTl1dXZudUZzRFd3MDVaQk4rYm10YTJ5eHZJCjFuRUdxdlkxT0xYU0tNcHErQ21JTEZLaGlTWFhxOVc3VzQvTzRwOCt2cHVlaFVqUDFXTVd1aS85V0traVFQWXoKVVhndWI4WDVYemRJVDRBZ0hoRGttTTIva2IyanhKdEFMeWRySkx3YVJOejd1azFVUnN3WjZoUk5hb3pLU2srTgpmVHl5bUduTGJ2OEhnd0tDQVFFQTV2d0NyNWdyd01jWDdBMk5oSE1QRVNnaVdJMko4RW52dlQ2Q2NVSlB2d25YCkFFd0ZDSjhnVmFJM1RqQXpTanJQZHVuRElZMHAyNVdGRER0N2lYVDV1SnZyOHd0dVdyeW5uNGxacGRjc2ZnT2MKbUx4ZHNraHIxMVZvVkFVa1RxcHNuaVNsbkw4OGU4VTh6Um5YTk44OTB1RlJhMUp0Y2dITEltNkZ4NXJNbC9hQgpBL0d4UlZkT09vVWlDSXNpaEtlUDFKY3VqSEZEbVEvQW1wYVpqZmE2MEVlT2xwM2Z2UFJIaWt6SmpQdzJyb0V3CklYQ1RjUzd3QnY1bmpQNUJjMGZpSkVwekpqdTN3Lzd6SDlKcXB6R0JhUU5oS3owc2JGUXZtZTBKZzVLNndCZlEKcVFabzdVS29OOVNRcEZmQkd0NXJRbWhWN3p3MnJmSDlkTGJySE02aDl3S0NBUUJFZ0pFeFIxSW5zdkNySWR0NQovYnVHdmw5NHdrWVVXNkRDMndpN3lGQ1NpeTg0WXRMeGxUWFNtU3QzY1BzNjNrcW1vVzRhN25CSXF6emFBNnBSCnZOTUdBT0szQ1ZQckI5WTAwTm1TaStQTURSdTR6cVd6dVFzUkd5VlFSMS9qTlRvR1ZwV0NlejRGdHVhU1dLT0IKbTBDQU00UFpGU1lDWU1nZnpzQzdMNlFvaFpMdEJPTUFrM1RpRE9HQ3NZOFZ4dEc5QWFaMC9OL3pkSmNZUEdRbwpDYlRPcWhLVERqSXJQY3JNNEo0SFd1Vi9vN1A1U1JDNFZ6Q3g0Y1pHK1pOU0FNcGdBdEc2UGltR0x0SXk3dnlMCmlBSTkvSDZRNGtQNDRFQjhoYTMwdm4xTUFNdDIra1kzTWRPUzFpMUdSWENPTTdWcnVHNE9GQzBNWkpEWStvbFkKSW5vNUFvSUJBUUN4YW13ZHQ0SVR4bGlIaFRsbnZJQ2NLdm1HanVDZVlndjBqYWpmbVpGcktRR3RScmF2RXBwaApiQUdTMnM3cWV4V1RQcVlVaFRyYllKTWEvQzA5UFUzNkVEbEJLMHlPQVUyMjZtYWpUUXp0bFpTWWRFWjdnOGpoClVzYVprb0Y2c3NoUEI5TDAvS1pXWXY3NmdwaWNFY1NJejJ3Y1JPbXZvZFVIRTJGS0dZcUFSbVIrUlF4bU9GTHIKL0FwU0NaeHZ5OHhiKyt0VEVFUmRzbzI3cjJVdWQ0NnlIK1JUcDl1TVpRejJHMWpNSTRWVXNBSHFEWEFTaFh6egp3a0NET1YvSVRqUlNyNlNXZjFQOUJUeDArc2ZVdDkrd2Exb2RQaFY2OTRiQWNpTFduc00yZkpSVzM0ZUVoYWh6CmlaKzN1M2VsVGFXUkVsYS9OTDdEQ0pZbktjQ1pGZFd0QW9JQkFRQzcxRkdybkcxVHQ3bGZObEdYMmlmVzJZcW4KZHhUN25tamVsMlVZNk5idjYxMmcyMGgrQXlSZzJrd1A0WDFVRzljZ05LVkxHZXhnMHN2SU12c1pyR1V3anNkRQpzbXI0SjJQbzZaSmd5c2N1SVI5Vkx0TUxhdXlWQ1RPbjQzVTE1WTNsK3lxVkFPYjlGQkQxQzE4VHVVT2hKMkpXCkhqb2dXRGd2bW1RK1JpZWxuQzlTWXUwZTVWaCt4TDZOaGFSWjhqU1N2SEZrYW1MRUlrSkhROWtxY3dlTnFGUHYKN0lhRjJ0eWhuakY5ZUlpZnlSMGpFMXVNNkFFQ0tlbXBzbVJQNkxFNld6aitaZjN3RDJSaU5SYUxya1U5K1Q4VwpYSFlzejh4RUdldEJ2NmFqSzRrMmNrak0ydllaYkszNlV5c2JYMzhEMVBGeDZrOEZuRXNHT2xPTW9ycEQKLS0tLS1FTkQgUlNBIFBSSVZBVEUgS0VZLS0tLS0K" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wget https://tos-tools.tos-cn-beijing.volces.com/linux/amd64/tosutil | ||
chmod +x tosutil | ||
./tosutil config -i <YOUR_ACCESS_KEY_ID> -k <YOUR_SECRET_ACCESS_KEY> -e tos-cn-beijing.ivolces.com -re cn-beijing | ||
./tosutil cp tos://aibrix-artifact-testing/models/Qwen3-8B ./models -r -p 8 -j 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The container command includes placeholders for access keys. This is a security risk as it encourages hardcoding secrets. These credentials should be provided securely, for example, by mounting them as environment variables from a Kubernetes secret.
This issue is repeated for VllmWorker
(line 109) and PrefillWorker
(line 142). Consider using envFrom
with a secretRef
to inject these values securely.
resources: | ||
- '*' | ||
verbs: | ||
- '*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This role grants wildcard permissions (*/*/*
), which is extremely permissive and violates the principle of least privilege. This poses a significant security risk. The permissions for dynamo-platform-dynamo-operator-manager-role
should be scoped down to only what is absolutely necessary for the component to function.
app.kubernetes.io/managed-by: Helm | ||
roleRef: | ||
apiGroup: rbac.authorization.k8s.io | ||
kind: ClusterRole |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
echo "model downloaded, start serving" | ||
|
||
dynamo serve graphs.disagg:PrefillWorker --system-app-port 5000 --enable-system-app --use-default-health-checks --service-name PrefillWorker |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PrefillWorker
service appears to be configured to serve graphs.disagg:PrefillWorker
, but this file is for the disagg_router
configuration. This seems like a copy-paste error and will likely cause the service to fail or misbehave. It should probably be graphs.disagg_router:PrefillWorker
.
dynamo serve graphs.disagg_router:PrefillWorker --system-app-port 5000 --enable-system-app --use-default-health-checks --service-name PrefillWorker
apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: ClusterRoleBinding | ||
metadata: | ||
name: pod-read-binding | ||
roleRef: | ||
apiGroup: rbac.authorization.k8s.io | ||
kind: ClusterRole | ||
name: pod-read | ||
subjects: | ||
- kind: ServiceAccount | ||
name: default | ||
namespace: default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Binding a ClusterRole
to the default
service account in the default
namespace is a security risk. Any pod in that namespace that does not specify a serviceAccountName
will inherit these permissions. It's much safer to create a dedicated service account for the application that needs these permissions and bind the role to that specific service account.
- -c | ||
- | | ||
|
||
apt update && apt install wget -y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- name: model-vol | ||
hostPath: | ||
path: /data01/models | ||
type: Directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using hostPath
volumes ties the pod to a specific node that has this path, which reduces portability and can cause scheduling issues. For tests, it's better to use a solution like emptyDir
with an init container to download models, or a PersistentVolumeClaim
if the data needs to persist. This applies to the decode
role as well.
- name: model-vol | ||
hostPath: | ||
path: /data01/models | ||
type: Directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using hostPath
volumes ties the pod to a specific node that has this path, which reduces portability and can cause scheduling issues. For tests, it's better to use a solution like emptyDir
with an init container to download models, or a PersistentVolumeClaim
if the data needs to persist. This applies to the decode
role as well.
@@ -0,0 +1,146 @@ | |||
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename folder name from dynamo-test
to dynamo
test/regression/v0.4.0/sglang/qwen-8b/sglang-router-1p1d-tp2.yaml
Outdated
Show resolved
Hide resolved
test/regression/v0.4.0/sglang/qwen-8b/sglang-router-1p1d-tp2.yaml
Outdated
Show resolved
Hide resolved
759de2d
to
6e8b357
Compare
Signed-off-by: Ning Wang <[email protected]>
Signed-off-by: Ning Wang <[email protected]>
…ch role Signed-off-by: Ning Wang <[email protected]>
…ch role Signed-off-by: Ning Wang <[email protected]>
6e8b357
to
24c0ede
Compare
Pull Request Description
Adding test sglang, vllm, and dynamo yamls
Related Issues
Resolves: #[1336]
Important: Before submitting, please complete the description above and review the checklist below.
Contribution Guidelines (Expand for Details)
We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:
Pull Request Title Format
Your PR title should start with one of these prefixes to indicate the nature of the change:
[Bug]
: Corrections to existing functionality[CI]
: Changes to build process or CI pipeline[Docs]
: Updates or additions to documentation[API]
: Modifications to aibrix's API or interface[CLI]
: Changes or additions to the Command Line Interface[Misc]
: For changes not covered above (use sparingly)Note: For changes spanning multiple categories, use multiple prefixes in order of importance.
Submission Checklist
By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.