Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

ahasasjeb/openai-swarm

Change Summary: Added await asyncio.sleep(0) in the run_and_stream method to release the event loop after each chunk is processed, ensuring non-blocking streaming for WebSocket responses without altering existing logic.

Front-end and back-end examples using FastAPI. Place the HTML in a folder named 'templates' and name it 'index.html'.

pip install fastapi starlette jinja2 uvicorn

#By GPT-4o
import asyncio
import os

from starlette.websockets import WebSocketDisconnect

from fastapi import FastAPI, Request, WebSocket
from fastapi.responses import HTMLResponse
from fastapi.templating import Jinja2Templates
from swarm import Swarm, Agent

app = FastAPI()
templates = Jinja2Templates(directory="templates")

# Initialize the Swarm client and agent
client = Swarm()
agent_1 = Agent(name="Agent 1 coder")

# HTTP route: render the HTML page
@app.get("/", response_class=HTMLResponse)
async def get(request: Request):
    return templates.TemplateResponse("index.html", {"request": request})

# Wrap Swarm's synchronous stream in an async generator
async def async_generator_wrapper(generator):
    for item in generator:
        yield item
        await asyncio.sleep(0)  # Yield control to the event loop after each chunk

# WebSocket route: handle messages from the front end and stream back responses
@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            data = await websocket.receive_text()
            print(f"Received message from client: {data}")

            stream = client.run(
                agent=agent_1,
                messages=[{"role": "user", "content": data}],
                model_override="gpt-4o-mini",
                stream=True,
            )

            async for chunk in async_generator_wrapper(stream):
                if 'content' in chunk and chunk["content"]:
                    content = chunk["content"]
                    print(f"Sending chunk: {content}")
                    await websocket.send_text(content)  # Send each chunk as soon as it arrives

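    except WebSocketDisconnect:
        print("WebSocket disconnected")
    except Exception as e:
        print(f"Unexpected error: {e}")
        try:
            await websocket.send_text(f"Error: {e}")
        except (RuntimeError, WebSocketDisconnect):
            pass  # The connection is already closed
    finally:
        print("Closing connection")
        try:
            await websocket.close()
        except RuntimeError:
            pass  # Already closed, e.g. after the client disconnected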



if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8008)
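
One caveat about the wrapper above: await asyncio.sleep(0) yields to the event loop between chunks, but each step of the for loop over the generator still runs synchronously, so the loop may stall while the next chunk is being fetched. A minimal sketch of one possible alternative, assuming Python 3.9+ for asyncio.to_thread, pulls each item in a worker thread instead. The names iterate_in_thread and slow_chunks are hypothetical, and slow_chunks only stands in for the stream returned by client.run(..., stream=True).

```python
import asyncio
import time

_DONE = object()  # Sentinel marking the end of the blocking generator


async def iterate_in_thread(generator):
    """Yield items from a blocking generator without stalling the event loop."""
    iterator = iter(generator)
    while True:
        # next() runs in a worker thread; the event loop stays free meanwhile
        item = await asyncio.to_thread(next, iterator, _DONE)
        if item is _DONE:
            break
        yield item


def slow_chunks():
    """Hypothetical stand-in for a blocking Swarm stream."""
    for text in ["Hello", ", ", "world"]:
        time.sleep(0.01)  # Simulates waiting on the network
        yield {"content": text}


async def collect():
    return [chunk["content"] async for chunk in iterate_in_thread(slow_chunks())]


result = asyncio.run(collect())
print(result)
```

In the endpoint, this helper could replace async_generator_wrapper in the async for loop without other changes.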

templates/index.html:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Agent live chat</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            background-color: #f4f4f4;
            display: flex;
            justify-content: center;
            align-items: center;
            height: 100vh;
            margin: 0;
        }

        .chat-container {
            width: 80%;
            max-width: 600px;
            background-color: white;
            padding: 20px;
            box-shadow: 0 0 10px rgba(0, 0, 0, 0.1);
            border-radius: 10px;
            display: flex;
            flex-direction: column;
            height: 100%;
        }

        h1 {
            text-align: center;
        }

        #agent-log {
            width: 100%;
            height: 300px;
            margin-bottom: 10px;
            overflow-y: auto;
            padding: 10px;
            border: 1px solid #ddd;
            border-radius: 5px;
            background-color: #f9f9f9;
            font-weight: bold;
        }

        #log {
            width: 100%;
            height: 100px;
            margin-bottom: 10px;
            overflow-y: auto;
            padding: 10px;
            border: 1px solid #ddd;
            border-radius: 5px;
            background-color: #f9f9f9;
        }

        #message {
            width: calc(100% - 100px);
            padding: 10px;
            margin-right: 10px;
            border: 1px solid #ddd;
            border-radius: 5px;
        }

        button {
            width: 80px;
            padding: 10px;
            background-color: #4CAF50;
            color: white;
            border: none;
            border-radius: 5px;
            cursor: pointer;
        }

        button:hover {
            background-color: #45a049;
        }

        .markdown {
            /* Styles for rendered Markdown */
            overflow-wrap: break-word;
        }
    </style>

    <!-- Load marked.js -->
    <script src="https://cdn.bootcdn.net/ajax/libs/marked/2.0.1/marked.min.js"></script>

    <script>
        let agentMessage = ""; // Accumulates the Agent's full response

        // Open a WebSocket connection and send the message
        function sendMessage() {
            const input = document.getElementById("message");
            const message = input.value.trim();
            if (!message) return;  // Avoid sending empty messages

            logMessage("You", message);
            agentMessage = ""; // Clear the previous Agent message
            input.value = "";

            const websocket = new WebSocket("ws://localhost:8008/ws");

            websocket.onopen = function() {
                websocket.send(message);
            };

            websocket.onmessage = function(event) {
                if (event.data && event.data !== "None") {
                    agentMessage += event.data; // Accumulate the message chunk by chunk
                    updateAgentMessage();
                }
            };
        }

        // Update the Agent response area on the page
        function updateAgentMessage() {
            const agentLog = document.getElementById("agent-log");
            agentLog.innerHTML = marked(agentMessage); // Parse Markdown with marked
        }

        // Write a log entry to the page
        function logMessage(sender, message) {
            const log = document.getElementById("log");
            const newMessage = document.createElement("div");
            newMessage.textContent = `${sender}: ${message}`;
            log.appendChild(newMessage);
            log.scrollTop = log.scrollHeight; // Scroll to the bottom of the log
        }
    </script>
</head>
<body>
    <div class="chat-container">
        <h1>Real-time chat with Agent</h1>
        <div id="agent-log"></div>
        <div id="log"></div>
        <input type="text" id="message" placeholder="Enter your message..." />
        <button onclick="sendMessage()">Send</button>
    </div>
</body>
</html>
<!-- By GPT-4o -->

Swarm (experimental, educational)

An educational framework exploring ergonomic, lightweight multi-agent orchestration.

Warning

Swarm is currently an experimental sample framework intended to explore ergonomic interfaces for multi-agent systems. It is not intended to be used in production, and therefore has no official support. (This also means we will not be reviewing PRs or issues!)

The primary goal of Swarm is to showcase the handoff & routines patterns explored in the Orchestrating Agents: Handoffs & Routines cookbook. It is not meant as a standalone library, and is primarily for educational purposes.

Install

Requires Python 3.10+

pip install git+ssh://git@github.com/ahasasjeb/openai-swarm.git

or

pip install git+https://github.com/ahasasjeb/openai-swarm.git

Usage

from swarm import Swarm, Agent

client = Swarm()

def transfer_to_agent_b():
    return agent_b


agent_a = Agent(
    name="Agent A",
    instructions="You are a helpful agent.",
    functions=[transfer_to_agent_b],
)

agent_b = Agent(
    name="Agent B",
    instructions="Only speak in Haikus.",
)

response = client.run(
    agent=agent_a,
    messages=[{"role": "user", "content": "I want to talk to agent B."}],
)

print(response.messages[-1]["content"])

Hope glimmers brightly,
New paths converge gracefully,
What can I assist?

Overview

Swarm focuses on making agent coordination and execution lightweight, highly controllable, and easily testable.

It accomplishes this through two primitive abstractions: Agents and handoffs. An Agent encompasses instructions and tools, and can at any point choose to hand off a conversation to another Agent.

These primitives are powerful enough to express rich dynamics between tools and networks of agents, allowing you to build scalable, real-world solutions while avoiding a steep learning curve.

Note

Swarm Agents are not related to Assistants in the Assistants API. They are named similarly for convenience, but are otherwise completely unrelated. Swarm is entirely powered by the Chat Completions API and is hence stateless between calls.

Why Swarm

Swarm explores patterns that are lightweight, scalable, and highly customizable by design. Approaches similar to Swarm are best suited for situations dealing with a large number of independent capabilities and instructions that are difficult to encode into a single prompt.

The Assistants API is a great option for developers looking for fully hosted threads and built-in memory management and retrieval. However, Swarm is an educational resource for developers curious to learn about multi-agent orchestration. Swarm runs (almost) entirely on the client and, much like the Chat Completions API, does not store state between calls.

Examples

Check out /examples for inspiration! Learn more about each one in its README.

  • basic: Simple examples of fundamentals like setup, function calling, handoffs, and context variables
  • triage_agent: Simple example of setting up a basic triage step to hand off to the right agent
  • weather_agent: Simple example of function calling
  • airline: A multi-agent setup for handling different customer service requests in an airline context.
  • support_bot: A customer service bot which includes a user interface agent and a help center agent with several tools
  • personal_shopper: A personal shopping agent that can help with making sales and refunding orders

Documentation

Running Swarm

Start by instantiating a Swarm client (which internally just instantiates an OpenAI client).

from swarm import Swarm

client = Swarm()

client.run()

Swarm's run() function is analogous to the chat.completions.create() function in the Chat Completions API – it takes messages and returns messages and saves no state between calls. Importantly, however, it also handles Agent function execution, hand-offs, context variable references, and can take multiple turns before returning to the user.

At its core, Swarm's client.run() implements the following loop:

  1. Get a completion from the current Agent
  2. Execute tool calls and append results
  3. Switch Agent if necessary
  4. Update context variables, if necessary
  5. If no new function calls, return
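
The loop above can be sketched with a toy, offline model of the control flow. This is not Swarm's implementation: ToyAgent, toy_run, and the complete callable (a stand-in for the model call that returns a reply and a list of function names to invoke) are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyAgent:
    name: str
    # Hypothetical stand-in for a model call: history -> (text, [function names])
    complete: Callable[[list], tuple]
    functions: dict = field(default_factory=dict)


def toy_run(agent, messages, context_variables=None, max_turns=10):
    history = list(messages)
    context = dict(context_variables or {})
    for _ in range(max_turns):
        text, calls = agent.complete(history)           # 1. get a completion
        history.append({"role": "assistant", "sender": agent.name, "content": text})
        if not calls:                                   # 5. no new function calls
            break
        next_agent = agent
        for fn_name in calls:
            result = agent.functions[fn_name]()         # 2. execute tool calls
            if isinstance(result, ToyAgent):            # 3. switch agent
                next_agent, result = result, f"Transferred to {result.name}"
            elif isinstance(result, dict):              # 4. update context variables
                context.update(result)
                result = "Done"
            history.append({"role": "tool", "content": str(result)})
        agent = next_agent
    # Return only the new messages, the last agent, and the updated context
    return history[len(messages):], agent, context


agent_b = ToyAgent("Agent B", complete=lambda history: ("What can I assist?", []))
agent_a = ToyAgent(
    "Agent A",
    complete=lambda history: ("", ["transfer_to_agent_b"]),
    functions={"transfer_to_agent_b": lambda: agent_b},
)
new_messages, last_agent, context = toy_run(
    agent_a, [{"role": "user", "content": "I want to talk to agent B."}]
)
print(last_agent.name)
```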

Arguments

| Argument | Type | Description | Default |
| --- | --- | --- | --- |
| agent | Agent | The (initial) agent to be called. | (required) |
| messages | List | A list of message objects, identical to Chat Completions messages | (required) |
| context_variables | dict | A dictionary of additional context variables, available to functions and Agent instructions | {} |
| max_turns | int | The maximum number of conversational turns allowed | float("inf") |
| model_override | str | An optional string to override the model being used by an Agent | None |
| execute_tools | bool | If False, interrupt execution and immediately return the tool_calls message when an Agent tries to call a function | True |
| stream | bool | If True, enables streaming responses | False |
| debug | bool | If True, enables debug logging | False |

Once client.run() is finished (after potentially multiple calls to agents and tools) it will return a Response containing all the relevant updated state. Specifically, the new messages, the last Agent to be called, and the most up-to-date context_variables. You can pass these values (plus new user messages) in to your next execution of client.run() to continue the interaction where it left off – much like chat.completions.create(). (The run_demo_loop function implements an example of a full execution loop in /swarm/repl/repl.py.)
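
A minimal sketch of carrying state from one call to the next, as described above. The helper continue_conversation is a hypothetical name, and FakeClient is an offline stand-in for Swarm() that only echoes the last message so the example runs without an API key; with a real client the same helper should apply, assuming the Response fields listed below.

```python
from types import SimpleNamespace


def continue_conversation(client, agent, messages, context_variables, user_input):
    """Run one more turn, carrying state over from the previous Response."""
    messages = messages + [{"role": "user", "content": user_input}]
    response = client.run(
        agent=agent,
        messages=messages,
        context_variables=context_variables,
    )
    # response.messages holds only the new messages, so append them to the history
    return response.agent, messages + response.messages, response.context_variables


class FakeClient:
    """Hypothetical offline stand-in for Swarm(), used only to exercise the helper."""

    def run(self, agent, messages, context_variables):
        reply = {"role": "assistant", "content": f"Echo: {messages[-1]['content']}"}
        return SimpleNamespace(
            messages=[reply], agent=agent, context_variables=context_variables
        )


agent, messages, context = "Agent A", [], {"user_name": "John"}
for user_input in ["Hi!", "Bye!"]:
    agent, messages, context = continue_conversation(
        FakeClient(), agent, messages, context, user_input
    )
print(len(messages))
```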

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| messages | List | A list of message objects generated during the conversation. Very similar to Chat Completions messages, but with a sender field indicating which Agent the message originated from. |
| agent | Agent | The last agent to handle a message. |
| context_variables | dict | The same as the input variables, plus any changes. |

Agents

An Agent simply encapsulates a set of instructions with a set of functions (plus some additional settings below), and has the capability to hand off execution to another Agent.

While it's tempting to personify an Agent as "someone who does X", it can also be used to represent a very specific workflow or step defined by a set of instructions and functions (e.g. a set of steps, a complex retrieval, single step of data transformation, etc). This allows Agents to be composed into a network of "agents", "workflows", and "tasks", all represented by the same primitive.

Agent Fields

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | The name of the agent. | "Agent" |
| model | str | The model to be used by the agent. | "gpt-4o" |
| instructions | str or func() -> str | Instructions for the agent, can be a string or a callable returning a string. | "You are a helpful agent." |
| functions | List | A list of functions that the agent can call. | [] |
| tool_choice | str | The tool choice for the agent, if any. | None |

Instructions

Agent instructions are directly converted into the system prompt of a conversation (as the first message). Only the instructions of the active Agent will be present at any given time (e.g. if there is an Agent handoff, the system prompt will change, but the chat history will not.)

agent = Agent(
   instructions="You are a helpful agent."
)

The instructions can either be a regular str, or a function that returns a str. The function can optionally receive a context_variables parameter, which will be populated by the context_variables passed into client.run().

def instructions(context_variables):
   user_name = context_variables["user_name"]
   return f"Help the user, {user_name}, do whatever they want."

agent = Agent(
   instructions=instructions
)
response = client.run(
   agent=agent,
   messages=[{"role":"user", "content": "Hi!"}],
   context_variables={"user_name":"John"}
)
print(response.messages[-1]["content"])

Hi John, how can I assist you today?

Functions

  • Swarm Agents can call Python functions directly.
  • Functions should usually return a str (Swarm attempts to cast other return values to str).
  • If a function returns an Agent, execution will be transferred to that Agent.
  • If a function defines a context_variables parameter, it will be populated by the context_variables passed into client.run().

def greet(context_variables, language):
   user_name = context_variables["user_name"]
   greeting = "Hola" if language.lower() == "spanish" else "Hello"
   print(f"{greeting}, {user_name}!")
   return "Done"

agent = Agent(
   functions=[greet]
)

client.run(
   agent=agent,
   messages=[{"role": "user", "content": "Usa greet() por favor."}],
   context_variables={"user_name": "John"}
)

Hola, John!

  • If an Agent function call has an error (missing function, wrong argument, error) an error response will be appended to the chat so the Agent can recover gracefully.
  • If multiple functions are called by the Agent, they will be executed in that order.
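
The context_variables rule in the list above can be illustrated with a small sketch based on inspect.signature. This is an assumption about how such injection could work, not Swarm's actual code; call_with_context and ping are hypothetical names.

```python
import inspect


def call_with_context(func, arguments, context_variables):
    """Call func with model-supplied arguments, injecting context_variables on request."""
    kwargs = dict(arguments)
    # Only functions that declare a context_variables parameter receive it
    if "context_variables" in inspect.signature(func).parameters:
        kwargs["context_variables"] = context_variables
    return str(func(**kwargs))  # Non-str results are cast to str


def greet(context_variables, language):
    user_name = context_variables["user_name"]
    greeting = "Hola" if language.lower() == "spanish" else "Hello"
    return f"{greeting}, {user_name}!"


def ping():
    return 200


greeting = call_with_context(greet, {"language": "Spanish"}, {"user_name": "John"})
status = call_with_context(ping, {}, {"user_name": "John"})
print(greeting, status)
```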

Handoffs and Updating Context Variables

An Agent can hand off to another Agent by returning it in a function.

sales_agent = Agent(name="Sales Agent")

def transfer_to_sales():
   return sales_agent

agent = Agent(functions=[transfer_to_sales])

response = client.run(agent, [{"role":"user", "content":"Transfer me to sales."}])
print(response.agent.name)

Sales Agent

It can also update the context_variables by returning a more complete Result object. This can also contain a value and an agent, in case you want a single function to return a value, update the agent, and update the context variables (or any subset of the three).

from swarm.types import Result

sales_agent = Agent(name="Sales Agent")

def talk_to_sales():
   print("Hello, World!")
   return Result(
       value="Done",
       agent=sales_agent,
       context_variables={"department": "sales"}
   )

agent = Agent(functions=[talk_to_sales])

response = client.run(
   agent=agent,
   messages=[{"role": "user", "content": "Transfer me to sales"}],
   context_variables={"user_name": "John"}
)
print(response.agent.name)
print(response.context_variables)

Sales Agent
{'department': 'sales', 'user_name': 'John'}

Note

If an Agent calls multiple functions to hand-off to an Agent, only the last handoff function will be used.

Function Schemas

Swarm automatically converts functions into a JSON Schema that is passed into Chat Completions tools.

  • Docstrings are turned into the function description.
  • Parameters without default values are set to required.
  • Type hints are mapped to the parameter's type (and default to string).
  • Per-parameter descriptions are not explicitly supported, but should work similarly if just added in the docstring. (In the future docstring argument parsing may be added.)

def greet(name, age: int, location: str = "New York"):
   """Greets the user. Make sure to get their name and age before calling.

   Args:
      name: Name of the user.
      age: Age of the user.
      location: Best place on earth.
   """
   print(f"Hello {name}, glad you are {age} in {location}!")

{
   "type": "function",
   "function": {
      "name": "greet",
      "description": "Greets the user. Make sure to get their name and age before calling.\n\nArgs:\n   name: Name of the user.\n   age: Age of the user.\n   location: Best place on earth.",
      "parameters": {
         "type": "object",
         "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "location": {"type": "string"}
         },
         "required": ["name", "age"]
      }
   }
}
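
The conversion rules above can be approximated with the standard inspect module. The sketch below is a simplified illustration rather than Swarm's own converter: function_to_schema and TYPE_MAP are names chosen for this example, and the type mapping is an assumption.

```python
import inspect

# Assumed mapping from Python type hints to JSON Schema type names
TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean",
            list: "array", dict: "object"}


def function_to_schema(func):
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown types fall back to "string"
        properties[name] = {"type": TYPE_MAP.get(param.annotation, "string")}
        # Parameters without a default value are required
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }


def greet(name, age: int, location: str = "New York"):
    """Greets the user. Make sure to get their name and age before calling."""
    print(f"Hello {name}, glad you are {age} in {location}!")


schema = function_to_schema(greet)
print(schema["function"]["parameters"])
```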

Streaming

stream = client.run(agent, messages, stream=True)
for chunk in stream:
   print(chunk)

Uses the same events as Chat Completions API streaming. See process_and_print_streaming_response in /swarm/repl/repl.py as an example.

Two new event types have been added:

  • {"delim":"start"} and {"delim":"end"}, to signal each time an Agent handles a single message (response or function call). This helps identify switches between Agents.
  • {"response": Response} will return a Response object at the end of a stream with the aggregated (complete) response, for convenience.
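
A minimal sketch of consuming these events, assuming each content chunk is a dict with an optional content key as in the FastAPI example earlier in this README. The helper consume_stream and the fake_stream list are hypothetical; a real stream would come from client.run(agent, messages, stream=True).

```python
def consume_stream(stream):
    """Collect streamed text per Agent message and return it with the final Response."""
    turns, current, final_response = [], [], None
    for chunk in stream:
        if "delim" in chunk:
            if chunk["delim"] == "start":
                current = []  # A new Agent message begins
            else:  # "end": one Agent message is complete
                turns.append("".join(current))
        elif "response" in chunk:
            final_response = chunk["response"]  # Aggregated response at the end
        elif chunk.get("content"):
            current.append(chunk["content"])
    return turns, final_response


# Hypothetical chunks, shaped like the events described above
fake_stream = [
    {"delim": "start"},
    {"content": "Hello"},
    {"content": None},
    {"content": ", world"},
    {"delim": "end"},
    {"response": "<Response object>"},
]
turns, final_response = consume_stream(fake_stream)
print(turns)
```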

Evaluations

Evaluations are crucial to any project, and we encourage developers to bring their own eval suites to test the performance of their swarms. For reference, we have some examples for how to eval swarm in the airline, weather_agent and triage_agent quickstart examples. See the READMEs for more details.

Utils

Use the run_demo_loop to test out your swarm! This will run a REPL on your command line. Supports streaming.

from swarm.repl import run_demo_loop
...
run_demo_loop(agent, stream=True)
