runpod-playground

Helpful utilities for deploying models to runpod.io.

Steps

cd /workspace
git clone https://github.com/ilkersigirci/runpod-playground.git
cd /workspace/runpod-playground

# Prepare .env file
make prepare-env-file

# Initial dependency install
make initial-runpod-install

# Download model
make download-model

# Start vLLM
make start-vllm

# See vLLM logs
make log-vllm

# Restart vLLM
make restart-vllm

# Start the simple GUI
make gui
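
Once make start-vllm is up, you can sanity-check the server from inside the pod. This is a minimal sketch assuming the server listens on 0.0.0.0:8000, the address used in the cURL examples below; /v1/models and /health are standard endpoints of vLLM's OpenAI-compatible server.

# List the models currently served (the name comes from SERVED_MODEL_NAME)
curl http://0.0.0.0:8000/v1/models

# Liveness probe; returns HTTP 200 when the server is healthy
curl -i http://0.0.0.0:8000/health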
  • An API health check is enabled by default; it sends a request to the vLLM server at a fixed interval.
    • To disable the health check, set ENABLE_HEALTH_CHECK=0 in the .env file.
  • To send health-check failure messages to Microsoft Teams, set TEAMS_WEBHOOK_URL in the .env file.
    • Example: TEAMS_WEBHOOK_URL=https://outlook.office.com/webhook/...
  • To deploy a different model, change the HF_MODEL_NAME variable in the .env file to the model you want, following the Hugging Face repository ID convention (org/model-name).
  • You can also change SERVED_MODEL_NAME to set the model name used in requests.
  • Likewise, MAX_CONTEXT_LEN can be set to the desired context length.
  • Example: change the default model to CohereForAI/c4ai-command-r-plus-GPTQ and set its context length to 40000 (a sketch of the resulting .env follows the commands):
make replace-value-in-env-file variable_name=HF_MODEL_NAME new_value=CohereForAI/c4ai-command-r-plus-GPTQ
make replace-value-in-env-file variable_name=MAX_CONTEXT_LEN new_value=40000
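
After running those two commands, the model-related part of the .env file would look roughly like the sketch below. The exact set of variables and the remaining values are assumptions for illustration; only the variable names mentioned above come from this repository.

# Hypothetical .env sketch; values other than the two just set are illustrative
HF_MODEL_NAME=CohereForAI/c4ai-command-r-plus-GPTQ
MAX_CONTEXT_LEN=40000
SERVED_MODEL_NAME=vLLM-Model
ENABLE_HEALTH_CHECK=1
TEAMS_WEBHOOK_URL=https://outlook.office.com/webhook/...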

cURL Examples

  • Request with a system message, assuming SERVED_MODEL_NAME=vLLM-Model:
curl --request POST \
    --url http://0.0.0.0:8000/v1/chat/completions \
    --header "Content-Type: application/json" \
    --data '{
  "model": "vLLM-Model",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful virtual assistant trained by OpenAI."
    },
    {
      "role": "user",
      "content": "Who are you?"
    }
  ],
  "temperature": 0.8,
  "stream": false
}'
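
A successful request returns a standard OpenAI-style chat completion object. The sketch below shows only the shape of the response; all field values are illustrative, not captured from a real run.

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "vLLM-Model",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I am a helpful virtual assistant..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
}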
  • Request without a system message, assuming SERVED_MODEL_NAME=vLLM-Model:
curl --request POST \
    --url http://0.0.0.0:8000/v1/chat/completions \
    --header "Content-Type: application/json" \
    --data '{
  "model": "vLLM-Model",
  "messages": [
    {
      "role": "user",
      "content": "Who are you?"
    }
  ],
  "temperature": 0.8,
  "stream": false
}'
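
Both examples above set "stream": false. As a variation not shown in the repository, setting "stream": true makes an OpenAI-compatible server such as vLLM return the completion incrementally as server-sent events: each line arrives as a data: {...} chunk, and the stream ends with data: [DONE].

# Streaming variant of the request above (response arrives as SSE chunks)
curl --request POST \
    --url http://0.0.0.0:8000/v1/chat/completions \
    --header "Content-Type: application/json" \
    --data '{
  "model": "vLLM-Model",
  "messages": [
    {
      "role": "user",
      "content": "Who are you?"
    }
  ],
  "temperature": 0.8,
  "stream": true
}'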

TabbyAPI Prompt Templates
