added ollama support with configurable base URL #46

Open · wants to merge 11 commits into main
19 changes: 19 additions & 0 deletions README.md
@@ -69,6 +69,25 @@ const zee = new ZeeWorkflow({
console.log(result);
})();
```
The agent will automatically handle streaming responses and format them appropriately.

### Using Ollama with the SDK

To use Ollama with the SDK, configure the `OLLAMA` provider in your agent setup:

```js
const agent = new Agent({
name: "LocalAgent",
model: {
provider: "OLLAMA",
name: "llama2", // use the model you pulled
baseURL: "http://localhost:11434", // optional
},
description: "A locally-running AI assistant",
    instructions: ["Answer the user's questions."],
});
```
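The Ollama-backed agent can then be dropped into a workflow like any other agent. A minimal sketch, mirroring the `ZeeWorkflow` example earlier in this README (the exact option names and `run` call are assumptions, not part of this change):

```js
const zee = new ZeeWorkflow({
    description: "A workflow that answers questions with a locally hosted model",
    output: "A plain-text answer to the user's question",
    agents: { agent },
});

(async function main() {
    const result = await ZeeWorkflow.run(zee);
    console.log(result);
})();
```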


## 🤝 Contributing

36 changes: 35 additions & 1 deletion docs/concepts/llms.mdx
@@ -11,7 +11,7 @@ const llm = new LLM({
provider: "OPEN_AI",
name: "gpt-4o-mini",
apiKey: process.env.OPENAI_API_KEY,
temperature: 0.7
temperature: 0.7,
});
```

@@ -131,3 +131,37 @@ const messages = [
```

Note: Image analysis is currently supported by OpenAI models. Some providers like Gemini may have limited or no support for image analysis.

## Experimental Features

### Ollama Integration

Ollama integration enables you to use locally hosted language models through Ollama's API. This experimental feature provides a way to run AI models on your own infrastructure.

#### Basic Usage

```tsx
const llm = new LLM({
provider: "OLLAMA",
name: "llama2", // or any model you've pulled in Ollama
baseURL: "http://localhost:11434", // optional, defaults to this
});
```


#### Supported Models

```plaintext Ollama
Any model name that you have pulled in Ollama
Examples:
"llama2"
"codellama"
"mistral"
"neural-chat"
```
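
The model has to be pulled into your local Ollama instance before the SDK can use it, for example via the Ollama CLI:

```plaintext Ollama
ollama pull mistral
```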

#### Environment Variables

```plaintext Ollama
OLLAMA_BASE_URL=http://localhost:11434 # Optional
```
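
When both are set, the environment variable takes precedence over the `baseURL` field in the model config, and `http://localhost:11434` is the final fallback. A sketch of the resolution order implemented for the `OLLAMA` provider (illustrative only; `model` stands in for the config passed to `LLM`):

```tsx
// Resolution order: OLLAMA_BASE_URL env var, then model config, then default
const baseURL =
    process.env["OLLAMA_BASE_URL"] ||
    model.baseURL ||
    "http://localhost:11434";
```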
1 change: 1 addition & 0 deletions packages/ai-agent-sdk/src/core/llm/index.ts
@@ -1,2 +1,3 @@
export * from "./llm";
export * from "./llm.types";

109 changes: 108 additions & 1 deletion packages/ai-agent-sdk/src/core/llm/llm.ts
@@ -1,6 +1,12 @@
import { Base } from "../base";
import type { Tool } from "../tools/base";
import type { FunctionToolCall, LLMResponse, ModelConfig } from "./llm.types";
import type {
FunctionToolCall,
LLMResponse,
ModelConfig,
OllamaResponse,
FormattedResponse,
} from "./llm.types";
import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import type {
@@ -83,6 +89,107 @@ export class LLM extends Base {
config.apiKey =
process.env["GEMINI_API_KEY"] || this.model.apiKey;
break;
case "OLLAMA": {
config.baseURL =
process.env["OLLAMA_BASE_URL"] ||
this.model.baseURL ||
"http://localhost:11434";

try {
const response = await fetch(`${config.baseURL}/api/chat`, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
model: this.model.name,
messages: messages.map((msg) => ({
role: msg.role,
content:
typeof msg.content === "string"
? msg.content
: "",
})),
stream: true,
}),
});

if (!response.ok || !response.body) {
throw new Error(
`Ollama API error: ${response.statusText}`
);
}

const reader = response.body.getReader();
let fullContent = "";

try {
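                        // Ollama streams newline-delimited JSON chunks; accumulate the assistant message content until done.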
while (true) {
const { done, value } = await reader.read();
if (done) break;

const chunk = new TextDecoder().decode(value);
const lines = chunk.split("\n");

for (const line of lines) {
if (line.trim()) {
try {
const json = JSON.parse(
line
) as OllamaResponse;
if (json.message?.content) {
fullContent += json.message.content;
}
} catch (e) {
// ignore parse errors
}
}
}
}
} finally {
reader.releaseLock();
}

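                    // Some reasoning models (e.g. deepseek-r1) wrap their chain-of-thought in <think>...</think> tags; split it from the final answer.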
const formatResponse = (
content: string
): FormattedResponse => {
const parts = content.split("</think>");
const thinking =
parts.length > 1 && parts[0]
? parts[0].replace("<think>", "").trim()
: "";
const response =
parts.length > 1
? parts[1]?.trim() || content.trim()
: content.trim();

return { thinking, response };
};

const formattedContent = formatResponse(fullContent);

return {
type: "content" as keyof T,
value: {
result: {
role: "assistant",
content: {
thinking: formattedContent.thinking,
answer: formattedContent.response,
},
},
status: "finished",
children: [],
},
} as LLMResponse<T>;
} catch (error) {
const errorMessage =
error instanceof Error
? error.message
: "Unknown error";
throw new Error(`Ollama API error: ${errorMessage}`);
}
}
default:
var _exhaustiveCheck: never = provider;
throw new Error(
30 changes: 29 additions & 1 deletion packages/ai-agent-sdk/src/core/llm/llm.types.ts
@@ -47,11 +47,39 @@ export type GeminiConfig = {
apiKey?: string;
};

export type OllamaModel = string; // any model name pulled into the local Ollama instance

export type OllamaConfig = {
provider: "OLLAMA";
name: OllamaModel;
toolChoice?: "auto" | "required";
temperature?: number;
apiKey?: string;
    baseURL?: string; // optional override for the Ollama base URL
};

export type ModelConfig =
| OpenAIConfig
| DeepSeekConfig
| GrokConfig
| GeminiConfig;
| GeminiConfig
| OllamaConfig;

export interface OllamaMessage {
role: string;
content: string;
}

export interface OllamaResponse {
model: string;
message?: OllamaMessage;
done: boolean;
}

export interface FormattedResponse {
thinking: string;
response: string;
}

export type LLMResponse<T extends Record<string, AnyZodObject>> = {
[K in keyof T]: {