Add tutorial for building a dual-mode Serverless worker #274

Draft · wants to merge 9 commits into base: main
3 changes: 2 additions & 1 deletion docs.json
@@ -84,7 +84,8 @@
"serverless/development/debugger",
"serverless/development/concurrency",
"serverless/development/environment-variables",
"serverless/development/test-response-times"
"serverless/development/test-response-times",
"serverless/development/dual-mode-worker"
]
}
]
361 changes: 361 additions & 0 deletions serverless/development/dual-mode-worker.mdx
@@ -0,0 +1,361 @@
---
title: "Build a dual-mode Serverless worker"
sidebarTitle: "Build a dual-mode worker"
description: "Create a flexible Serverless worker that supports a Pod-first development workflow."
---

Developing machine learning and AI applications often requires powerful GPUs, making local development of API endpoints challenging. A typical development workflow for [Serverless](/serverless/overview) would be to write your handler code, deploy it directly to a Serverless endpoint, send endpoint requests to test, debug using worker logs, and repeat.

This approach can have significant drawbacks, such as:

* **Slow iteration**: Each deployment requires a new build and test cycle, which can be time-consuming.
* **Limited visibility**: Logs and errors are not always easy to debug, especially when running in a remote environment.
* **Resource constraints**: Your local machine may not have the necessary resources to test your application.

This tutorial shows how to build a "Pod-first" development environment: creating a flexible, dual-mode Docker image that can be deployed as either a Pod or a Serverless worker.

Using this method, you'll leverage a [Pod](/pods/overview)—a GPU instance ideal for interactive development, with tools like Jupyter Notebooks and direct IDE integration—as your cloud-based development machine. The Pod will be deployed with a flexible Docker base, allowing the same container image to be seamlessly deployed to a Serverless endpoint.

This workflow lets you develop and thoroughly test your application in a containerized Pod environment. Then, when you're ready for production, you can deploy the same image to Serverless instantly.

Follow the steps below to create a worker image that leverages this flexibility, allowing for faster iteration and more robust deployments.

<Tip>

To get a basic dual-mode worker up and running immediately, you can [clone this repository](https://github.com/justinwlin/Runpod-GPU-And-Serverless-Base) and use it as a base.

</Tip>

## What you'll learn

In this tutorial you'll learn how to:

* Set up a project for a dual-mode Serverless worker.
* Create a handler file (`handler.py`) that adapts its behavior based on a user-specified environment variable.
* Write a startup script (`start.sh`) to manage different operational modes.
* Build a Docker image designed for flexibility.
* Understand and utilize the "Pod-first" development workflow.
* Deploy and test your worker in both Pod and Serverless environments.

## Requirements

* You've [created a RunPod account](/get-started/manage-accounts).
* You've installed [Python 3.x](https://www.python.org/downloads/) and [Docker](https://docs.docker.com/get-started/get-docker/) on your local machine and configured them for your command line.
* Basic understanding of Docker concepts and shell scripting.

## Step 1: Set up your project structure

First, create a directory for your project and the necessary files.

Open your terminal and run the following commands:

```sh
mkdir dual-mode-worker
cd dual-mode-worker
touch handler.py start.sh Dockerfile requirements.txt
```

This creates:

- `handler.py`: Your Python script with the RunPod handler logic.
- `start.sh`: A shell script that will be the entrypoint for your Docker container.
- `Dockerfile`: Instructions to build your Docker image.
- `requirements.txt`: A file to list Python dependencies.
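
The handler in the next step imports the RunPod Python SDK, so `requirements.txt` needs at least that one entry. Add it now:

```txt
runpod
```

If your worker later needs additional Python packages, add them here as well so the Docker build installs them automatically.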

## Step 2: Create the `handler.py` file

This Python script will contain your core logic. It will check for a user-specified environment variable `MODE_TO_RUN` to determine whether to run in Pod or Serverless mode.

Add the following code to `handler.py`:

```python
import runpod
import os
import asyncio

# Determine the operational mode. Defaults to 'pod' if not set.
MODE_TO_RUN = os.getenv("MODE_TO_RUN", "pod")

async def handler(event):
    """Process incoming requests to your Serverless endpoint,
    or run a test job if in 'pod' mode."""

    print(f"Handler invoked in {MODE_TO_RUN} mode.")

    input_data = event.get('input', {})
    prompt = input_data.get('prompt', 'default prompt')
    delay = input_data.get('delay', 1)

    print(f"Received prompt: {prompt}")
    print(f"Processing for {delay} seconds...")

    # Simulate work
    await asyncio.sleep(delay)

    return {"output": f"Processed prompt: '{prompt}' after {delay}s in {MODE_TO_RUN} mode."}

# Start the Serverless worker or run a test based on the mode
if __name__ == '__main__':
    if MODE_TO_RUN == "serverless":
        print("Starting RunPod Serverless worker...")
        runpod.serverless.start({
            "handler": handler
        })
    elif MODE_TO_RUN == "pod":
        print("Running in Pod mode. Simulating a test call to the handler.")

        # This block allows direct testing of the handler in a Pod environment
        async def main_test_pod():
            test_event = {
                "input": {
                    "prompt": "Pod test call!",
                    "delay": 2
                }
            }
            result = await handler(test_event)
            print("--- Pod Mode Test Handler Output ---")
            print(result)
            print("------------------------------------")

        asyncio.run(main_test_pod())
    else:
        print(f"Unknown MODE_TO_RUN: {MODE_TO_RUN}. Exiting.")
```

Key features:

* `MODE_TO_RUN = os.getenv("MODE_TO_RUN", "pod")`: Reads the mode from an environment variable, defaulting to `pod`.
* `async def handler(event)`: Your core logic. It's an `async` function as required by `runpod.serverless.start`.
* `if __name__ == '__main__':`: This block controls what happens when the script is executed directly.
* In `serverless` mode, it starts the RunPod Serverless worker.
* In `pod` mode, it runs a sample test call to your `handler` function, allowing for quick iteration.
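
Because the handler is a plain `async` function, you can sanity-check its logic with nothing but the standard library before involving the RunPod SDK. A minimal sketch (standalone, with the `runpod`-specific parts stripped out; the names mirror the handler above):

```python
import asyncio
import os

# Stand-in for the tutorial handler, minus the runpod import.
MODE_TO_RUN = os.getenv("MODE_TO_RUN", "pod")

async def handler(event):
    input_data = event.get("input", {})
    prompt = input_data.get("prompt", "default prompt")
    delay = input_data.get("delay", 1)
    await asyncio.sleep(delay)  # Simulate work
    return {"output": f"Processed prompt: '{prompt}' after {delay}s in {MODE_TO_RUN} mode."}

# Invoke the handler directly with a test event, just like the Pod-mode harness does.
result = asyncio.run(handler({"input": {"prompt": "Pod test call!", "delay": 0}}))
print(result["output"])
```

This is the same pattern the `elif MODE_TO_RUN == "pod":` branch uses: build a fake `event` dict and `await` the handler on it.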

## Step 3: Create the `start.sh` script

This script will be the entrypoint for your Docker container. It reads the `MODE_TO_RUN` environment variable and configures the container accordingly.

Add the following code to `start.sh`:

```bash
#!/bin/bash
set -e # Exit immediately if a command exits with a non-zero status.

echo "Container starting with MODE_TO_RUN=${MODE_TO_RUN}"

case "$MODE_TO_RUN" in
    serverless)
        echo "Starting in Serverless mode..."
        # Execute the Python handler, which will start the RunPod worker
        exec python3 -u /app/handler.py
        ;;
    pod)
        echo "Starting in Pod mode..."
        echo "Development services (e.g., Jupyter, SSH) would start here."
        echo "The handler.py script will run its test harness if executed."
        echo "You can connect to the Pod and manually run 'python /app/handler.py'."
        # Keep the container running for interactive Pod sessions
        exec sleep infinity
        ;;
    *)
        echo "Error: Invalid MODE_TO_RUN value: '$MODE_TO_RUN'. Expected 'serverless' or 'pod'."
        exit 1
        ;;
esac
```
Key features:
* `case $MODE_TO_RUN in ... esac`: This structure directs the startup based on the mode.
* `serverless` mode: Executes `handler.py`, which then starts the RunPod Serverless worker. `exec` replaces the shell process with the Python process.
* `pod` mode: Prints messages indicating it's ready for development. It then runs `sleep infinity` to keep the container alive so you can connect to it (e.g., via SSH or `docker exec`). You would then manually run `python /app/handler.py` inside the Pod to test your handler logic.
* `set -e`: Ensures the script exits if any command fails.
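
You can dry-run this dispatch logic on your local machine without Docker. A minimal sketch, with the `exec` targets replaced by `echo` so nothing long-running actually starts (`MODE_TO_RUN` falls back to `pod` here for illustration):

```shell
#!/bin/bash
set -e

# Default to pod mode when the variable is unset, mirroring the Dockerfile default.
MODE_TO_RUN="${MODE_TO_RUN:-pod}"

case "$MODE_TO_RUN" in
    serverless)
        echo "would exec: python3 -u /app/handler.py"
        ;;
    pod)
        echo "would exec: sleep infinity"
        ;;
    *)
        echo "Error: Invalid MODE_TO_RUN value: '$MODE_TO_RUN'." >&2
        exit 1
        ;;
esac
```

Running it with `MODE_TO_RUN=serverless bash sketch.sh` versus plain `bash sketch.sh` shows the two branches without building an image.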

## Step 4: Create the `Dockerfile`

This file defines how to build your Docker image.

Add the following content to `Dockerfile`:

```dockerfile
# Use a standard Python base image
FROM python:3.10-slim

# Set the working directory in the container
WORKDIR /app

# Copy the requirements file and install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application code
COPY handler.py .
COPY start.sh .

# Set default environment variables (MODE_TO_RUN defaults to pod mode)
ENV PYTHONUNBUFFERED=1
ENV MODE_TO_RUN="pod"

# Make the startup script executable
RUN chmod +x /app/start.sh

# Set the default command for the container
CMD ["/app/start.sh"]
```
Key features:
* `FROM python:3.10-slim`: Starts with a lightweight Python image.
* `WORKDIR /app`: Sets the current directory inside the container.
* `COPY requirements.txt .` and `RUN pip install ...`: Installs Python dependencies.
* `COPY handler.py .` and `COPY start.sh .`: Copies your application files.
* `ENV MODE_TO_RUN="pod"`: Sets the default operational mode to "Pod". This can be overridden at runtime.
* `RUN chmod +x /app/start.sh`: Makes your startup script executable.
* `CMD ["/app/start.sh"]`: Specifies `start.sh` as the command to run when the container starts.

## Step 5: Build and push your Docker image

<Info>

Instead of building and pushing your image via Docker Hub, you can also [deploy your worker from a GitHub repository](/serverless/workers/github-integration).

</Info>

Now, build your Docker image and push it to a container registry like Docker Hub.

<Steps>
<Step title="Build your Docker image">
Build your Docker image, replacing `[YOUR_USERNAME]` with your Docker Hub username and choosing a suitable image name:

```sh
docker build --platform linux/amd64 --tag [YOUR_USERNAME]/dual-mode-worker .
```
The `--platform linux/amd64` flag is important for compatibility with RunPod's infrastructure.
</Step>

<Step title="Push the image to your container registry">
```sh
docker push [YOUR_USERNAME]/dual-mode-worker:latest
```
<Note>

You might need to run `docker login` first.

</Note>
</Step>
</Steps>

## Step 6: Testing in Pod mode

Now that you've built your Docker image, let's explore how to use the Pod-first development workflow in practice.

You can run your container locally with Docker:

```sh
docker run -e MODE_TO_RUN=pod --rm -it [YOUR_USERNAME]/dual-mode-worker
```

Or, deploy the image to a Pod:

1. Go to the [Pods page](https://www.runpod.io/console/pods) in the RunPod console and click **Create Pod**.
2. Choose a GPU.
3. For "Docker Image Name", enter `[YOUR_USERNAME]/dual-mode-worker:latest`.
4. Under **Pod Template**, select **Edit Template**.
5. Under **Public Environment Variables**, select **Add environment variable**. Set variable key to **`MODE_TO_RUN`** and the value to **`pod`**.
6. Select **Set Overrides**, then deploy your Pod.

After [connecting to the Pod](/pods/connect-to-pod), navigate to `/app` and run your handler directly:

```sh
python handler.py
```

This will execute the Pod-specific test harness in your `handler.py`, giving you immediate feedback. You can edit `handler.py` within the Pod and re-run it for rapid iteration.

## Step 7: Deploy to a Serverless endpoint

Once you're confident with your `handler.py` logic tested in Pod mode, you're ready to deploy your dual-mode worker to a Serverless endpoint.

1. Go to the [Serverless section](https://www.runpod.io/console/serverless) of the RunPod console.
2. Click **New Endpoint**.
3. Under **Custom Source**, select **Docker Image**, then select **Next**.
4. In the **Container Image** field, enter your Docker image URL: `docker.io/[YOUR_USERNAME]/dual-mode-worker:latest`.
5. Under **Advanced Settings > Environment Variables**, set `MODE_TO_RUN` to `serverless`.
6. Configure GPU, workers, and other settings as needed.
7. Select **Create Endpoint**.

The *same* image is used, but `start.sh` will now direct it to run in Serverless mode, starting the `runpod.serverless.start` worker.

## Step 8: Test your endpoint

After deploying your endpoint in Serverless mode, you can test it with the following steps:

1. Navigate to your endpoint's detail page in the RunPod console.
2. Click the **Requests** tab.
3. Use the following JSON as test input:
```json
{
  "input": {
    "prompt": "Hello World!"
  }
}
```
4. Click **Run**.

After a few moments for initialization and processing, you should see output similar to this:

```json
{
  "delayTime": 12345, // This will vary
  "executionTime": 1050, // Roughly the 1s default delay plus overhead
  "id": "some-unique-id",
  "output": {
    "output": "Processed prompt: 'Hello World!' after 1s in serverless mode."
  },
  "status": "COMPLETED"
}
```
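
Note the nesting: the top-level `output` field holds whatever your handler returned, which is itself a dict with its own `output` key. A minimal sketch of unpacking such a response (the response body below is illustrative, not a live API call):

```python
import json

# Illustrative response body; real values will vary per request.
raw = """
{
  "id": "some-unique-id",
  "status": "COMPLETED",
  "output": {
    "output": "Processed prompt: 'Hello World!' after 1s in serverless mode."
  }
}
"""

response = json.loads(raw)
if response["status"] == "COMPLETED":
    # The handler's return value sits under the top-level "output" key.
    result = response["output"]["output"]
    print(result)
```

If you later return a flat value from your handler (for example, a string instead of a dict), the inner lookup changes accordingly.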

## Explore the Pod-first development workflow

Congratulations! You've successfully built, deployed, and tested a dual-mode Serverless worker. Now, let's explore the recommended iteration process for a Pod-first development workflow:

<Steps>
<Step title="Develop using Pod mode">
1. Deploy your initial Docker image to a RunPod Pod, ensuring `MODE_TO_RUN` is set to `pod` (or rely on the Dockerfile default).
2. [Connect to your Pod](/pods/connect-to-pod) (via SSH or web terminal).
3. Navigate to the `/app` directory.
4. As you develop, install any necessary Python packages (`pip install [PACKAGE_NAME]`) or system dependencies (`apt-get install [PACKAGE_NAME]`).
5. Iterate on your `handler.py` script. Test your changes frequently by running `python handler.py` directly in the Pod's terminal. This will execute the test harness you defined in the `elif MODE_TO_RUN == "pod":` block, giving you immediate feedback.
</Step>

<Step title="Update your Docker image">
Once you're satisfied with a set of changes and have new dependencies:
1. Add new Python packages to your `requirements.txt` file.
2. Add system installation commands (e.g., `RUN apt-get update && apt-get install -y [PACKAGE_NAME]`) to your `Dockerfile`.
3. Ensure your updated `handler.py` is saved.

</Step>

<Step title="Deploy and test in Serverless mode">

1. Deploy your worker image to a Serverless endpoint using [Docker Hub](/serverless/workers/deploy) or [GitHub](/serverless/workers/github-integration).
2. During deployment, ensure that the `MODE_TO_RUN` environment variable for the endpoint is set to `serverless`.

<Note>

For instructions on how to set environment variables during deployment, see [Manage endpoints](/serverless/endpoints/manage-endpoints).

</Note>

3. After your endpoint is deployed, you can test it by [sending API requests](/serverless/endpoints/send-requests).
</Step>
</Steps>

This iterative loop (develop and test your handler in Pod mode, update the Docker image, then deploy to Serverless) allows for rapid development and debugging of your Serverless workers.

## Next steps

Now that you've mastered the dual-mode development workflow, you can:

* [Explore advanced handler functions.](/serverless/workers/handler-functions)
* [Learn about sending requests programmatically via API or SDKs.](/serverless/endpoints/send-requests)
* [Understand endpoint configurations for performance and cost optimization.](/serverless/endpoints/endpoint-configurations)
* [Deep dive into local testing and development.](/serverless/development/local-testing)