update for local inference demo for LS 0.1 #163

Merged 11 commits on Feb 6, 2025
17 changes: 10 additions & 7 deletions examples/ios_calendar_assistant/README.md
@@ -4,11 +4,13 @@ iOSCalendarAssistant is a demo app ([video](https://drive.google.com/file/d/1xjd

You can also test the create calendar event with a direct ask instead of a detailed meeting note.

-# Installation
+## Installation

We recommend you try the [iOS Quick Demo](../ios_quick_demo) first to confirm the prerequisite and installation - both demos have the same prerequisite and the first two installation steps.

-## Prerequisite
+The quickest way to try out the demo for remote inference is using Together.ai's Llama Stack distro at https://llama-stack.together.ai - you can skip the next section and go to the Build and Run the iOS demo section directly.
+
+## (Optional) Build and Run Own Llama Stack Distro

You need to set up a remote Llama Stack distribution to run this demo. Assuming you have a [Fireworks](https://fireworks.ai/account/api-keys) or [Together](https://api.together.ai/) API key, which you can easily get by clicking the links above:

@@ -39,24 +41,25 @@ The default port is 5000 for `llama stack run` and you can specify a different port

2. Under the iOSCalendarAssistant project - Package Dependencies, click the + sign, then add `https://github.com/meta-llama/llama-stack-client-swift` at the top right and 0.1.0 in the Dependency Rule, then click Add Package.
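If you prefer managing dependencies in a `Package.swift` manifest rather than through Xcode's Package Dependencies UI, the equivalent declaration would look roughly like the sketch below. The product name `LlamaStackClient` matches the module imported in `ContentView.swift`; the target layout and platform version are illustrative assumptions.

```swift
// swift-tools-version:5.9
// Hypothetical manifest sketch; the demo itself uses Xcode's Package Dependencies UI.
import PackageDescription

let package = Package(
    name: "iOSCalendarAssistant",
    platforms: [.iOS(.v16)],
    dependencies: [
        // Pin to 0.1.0 to match the Dependency Rule used in Xcode above.
        .package(url: "https://github.com/meta-llama/llama-stack-client-swift", from: "0.1.0")
    ],
    targets: [
        .target(
            name: "iOSCalendarAssistant",
            dependencies: [
                .product(name: "LlamaStackClient", package: "llama-stack-client-swift")
            ]
        )
    ]
)
```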

-3. Replace the `RemoteInference` url string in `ContentView.swift` below with the host IP and port of the remote Llama Stack distro started in Prerequisite:
+3. (Optional) Replace the `RemoteInference` url string in `ContentView.swift` below with the host IP and port of the remote Llama Stack distro in Build and Run Own Llama Stack Distro:

```
-private let agent = RemoteAgents(url: URL(string: "http://127.0.0.1:5000")!)
+private let agent = RemoteAgents(url: URL(string: "https://llama-stack.together.ai")!)
```

**Note:** In order for the app to access the remote URL, the app's `Info.plist` needs to have the entry `App Transport Security Settings` with `Allow Arbitrary Loads` set to YES.

Also, to allow the app to add event to the Calendar app, the `Info.plist` needs to have an entry `Privacy - Calendars Usage Description` and when running the app for the first time, you need to accept the Calendar access request.
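In source form, the two `Info.plist` entries described above correspond to the following keys (the usage-description string is just an example; write your own):

```xml
<!-- Info.plist additions: allow the remote URL and request Calendar access -->
<key>NSAppTransportSecurity</key>
<dict>
    <key>NSAllowsArbitraryLoads</key>
    <true/>
</dict>
<key>NSCalendarsUsageDescription</key>
<string>iOSCalendarAssistant needs Calendar access to create events from your meeting notes.</string>
```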

4. Build and run the app on an iOS simulator or your device. First, you may try a simple request:

```
-Create a calendar event with a meeting title as Llama Stack update for 2-3pm January 27, 2025.
+Create a calendar event with a meeting title as Llama Stack update for 2-3pm February 3, 2025.
```
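For reference, `ContentView.swift` reads three string arguments from the resulting `create_event` tool call: `event_name`, `start`, and `end`, with dates in `yyyy-MM-dd HH:mm` format. A prompt like the one above would therefore be expected to resolve into arguments along these lines (illustrative values, not actual model output):

```json
{
  "event_name": "Llama Stack update",
  "start": "2025-02-03 14:00",
  "end": "2025-02-03 15:00"
}
```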

Then, a detailed meeting note:
```
-Date: January 20, 2025
+Date: February 4, 2025
Time: 10:00 AM - 11:00 AM
Location: Zoom
Attendees:
@@ -80,7 +83,7 @@ Sarah: Good. Jane, any updates from operations?
Jane: Yes, logistics are sorted, and we’ve confirmed the warehouse availability. The only pending item is training customer support for the new product.
Sarah: Let’s coordinate with the training team to expedite that. Anything else?
Mike: Quick note—can we get feedback on the beta version by Friday?
-Sarah: Yes, let’s make that a priority. Anything else? No? Great. Thanks, everyone. Let’s meet again next week from 4-5pm on January 27, 2025 to review progress.
+Sarah: Yes, let’s make that a priority. Anything else? No? Great. Thanks, everyone. Let’s meet again next week from 4-5pm on February 11, 2025 to review progress.
```

You'll see a summary, action items and a Calendar event created, made possible by Llama Stack's custom tool calling API support and Llama 3.1's tool calling capability.
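The Calendar step hinges on parsing the model's `start`/`end` tool-call arguments with a fixed `DateFormatter` pattern (`yyyy-MM-dd HH:mm`, as in `ContentView.swift`). A standalone sketch of that parsing step, runnable with just Foundation; note the app uses `Locale.current`, while `en_US_POSIX` here is a common hardening choice for fixed-format dates:

```swift
import Foundation

// Parse a date string in the same format the create_event tool call
// produces ("yyyy-MM-dd HH:mm"); returns nil if the string doesn't match.
func parseEventDate(_ s: String) -> Date? {
    let formatter = DateFormatter()
    formatter.dateFormat = "yyyy-MM-dd HH:mm"
    formatter.timeZone = TimeZone.current
    formatter.locale = Locale(identifier: "en_US_POSIX")
    return formatter.date(from: s)
}

// Fall back to "now" when parsing fails, mirroring the `?? Date()` in the app.
let start = parseEventDate("2025-02-11 16:00") ?? Date()
let end = parseEventDate("2025-02-11 17:00") ?? Date()
print(end.timeIntervalSince(start)) // one-hour meeting
```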
@@ -39,7 +39,9 @@ struct ContentView: View {
public init () {
self.inference = LocalInference(queue: runnerQueue)
self.localAgents = LocalAgents(inference: self.inference)
-self.remoteAgents = RemoteAgents(url: URL(string: "http://localhost:5000")!)
+
+// replace the URL string if you build and run your own Llama Stack distro as shown in https://github.com/meta-llama/llama-stack-apps/tree/main/examples/ios_calendar_assistant#optional-build-and-run-own-llama-stack-distro
+self.remoteAgents = RemoteAgents(url: URL(string: "https://llama-stack.together.ai")!)
}

var agents: Agents {
@@ -130,39 +132,39 @@ struct ContentView: View {
func summarizeConversation(prompt: String) async {
do {
let request = Components.Schemas.CreateAgentTurnRequest(
-agent_id: self.agentId,
messages: [
.UserMessage(Components.Schemas.UserMessage(
content: .case1("Summarize the following conversation in 1-2 sentences:\n\n \(prompt)"),
role: .user
))
],
-session_id: self.agenticSystemSessionId,
stream: true
)

-for try await chunk in try await self.agents.createTurn(request: request) {
+for try await chunk in try await self.agents.createTurn(agent_id: self.agentId, session_id: self.agenticSystemSessionId, request: request) {
let payload = chunk.event.payload
switch (payload) {
-case .AgentTurnResponseStepStartPayload(_):
+case .step_start(_):
break
-case .AgentTurnResponseStepProgressPayload(let step):
-if (step.model_response_text_delta != nil) {
+case .step_progress(let step):
+if (step.delta != nil) {
DispatchQueue.main.async {
withAnimation {
var message = messages.removeLast()
-message.text += step.model_response_text_delta!
+if case .text(let delta) = step.delta {
+message.text += "\(delta.text)"
+}
message.tokenCount += 2
message.dateUpdated = Date()
messages.append(message)
}
}
}
-case .AgentTurnResponseStepCompletePayload(_):
+case .step_complete(_):
break
-case .AgentTurnResponseTurnStartPayload(_):
+case .turn_start(_):
break
-case .AgentTurnResponseTurnCompletePayload(_):
+case .turn_complete(_):
break

}
@@ -175,103 +177,100 @@ struct ContentView: View {

func actionItems(prompt: String) async throws {
let request = Components.Schemas.CreateAgentTurnRequest(
-agent_id: self.agentId,
messages: [
.UserMessage(Components.Schemas.UserMessage(
content: .case1("List out any action items based on this text:\n\n \(prompt)"),
role: .user
))
],
-session_id: self.agenticSystemSessionId,
stream: true
)

-for try await chunk in try await self.agents.createTurn(request: request) {
+for try await chunk in try await self.agents.createTurn(agent_id: self.agentId, session_id: self.agenticSystemSessionId, request: request) {
let payload = chunk.event.payload
switch (payload) {
-case .AgentTurnResponseStepStartPayload(_):
+case .step_start(_):
break
-case .AgentTurnResponseStepProgressPayload(let step):
-if (step.model_response_text_delta != nil) {
-DispatchQueue.main.async {
-withAnimation {
-var message = messages.removeLast()
-message.text += step.model_response_text_delta!
-message.tokenCount += 2
-message.dateUpdated = Date()
-messages.append(message)
-
-self.actionItems += step.model_response_text_delta!
+case .step_progress(let step):
+DispatchQueue.main.async(execute: DispatchWorkItem {
+withAnimation {
+var message = messages.removeLast()
+
+if case .text(let delta) = step.delta {
+message.text += "\(delta.text)"
+self.actionItems += "\(delta.text)"
+}
+message.tokenCount += 2
+message.dateUpdated = Date()
+messages.append(message)
}
}
-case .AgentTurnResponseStepCompletePayload(_):
+})
+case .step_complete(_):
break
-case .AgentTurnResponseTurnStartPayload(_):
+case .turn_start(_):
break
-case .AgentTurnResponseTurnCompletePayload(_):
+case .turn_complete(_):
break
}
}
}

func callTools(prompt: String) async throws {
let request = Components.Schemas.CreateAgentTurnRequest(
-agent_id: self.agentId,
messages: [
.UserMessage(Components.Schemas.UserMessage(
content: .case1("Call functions as needed to handle any actions in the following text:\n\n" + prompt),
role: .user
))
],
-session_id: self.agenticSystemSessionId,
stream: true
)

-for try await chunk in try await self.agents.createTurn(request: request) {
+for try await chunk in try await self.agents.createTurn(agent_id: self.agentId, session_id: self.agenticSystemSessionId, request: request) {
let payload = chunk.event.payload
switch (payload) {
-case .AgentTurnResponseStepStartPayload(_):
+case .step_start(_):
break
-case .AgentTurnResponseStepProgressPayload(let step):
-if (step.tool_call_delta != nil) {
-switch (step.tool_call_delta!.content) {
-case .case1(_):
-break
-case .ToolCall(let call):
-switch (call.tool_name) {
-case .BuiltinTool(_):
-break
-case .case2(let toolName):
-if (toolName == "create_event") {
-var args: [String : String] = [:]
-for (arg_name, arg) in call.arguments.additionalProperties {
-switch (arg) {
-case .case1(let s): // type string
-args[arg_name] = s
-case .case2(_), .case3(_), .case4(_), .case5(_), .case6(_):
-break
+case .step_progress(let step):
+switch (step.delta) {
+case .tool_call(let call):
+if call.parse_status == .succeeded {
+switch (call.tool_call) {
+case .ToolCall(let toolCall):
+var args: [String : String] = [:]
+for (arg_name, arg) in toolCall.arguments.additionalProperties {
+switch (arg) {
+case .case1(let s):
+args[arg_name] = s
+case .case2(_), .case3(_), .case4(_), .case5(_), .case6(_):
+break
+}
+}
}

-let formatter = DateFormatter()
-formatter.dateFormat = "yyyy-MM-dd HH:mm"
-formatter.timeZone = TimeZone.current
-formatter.locale = Locale.current
-self.triggerAddEventToCalendar(
-title: args["event_name"]!,
-startDate: formatter.date(from: args["start"]!) ?? Date(),
-endDate: formatter.date(from: args["end"]!) ?? Date()
-)
+let formatter = DateFormatter()
+formatter.dateFormat = "yyyy-MM-dd HH:mm"
+formatter.timeZone = TimeZone.current
+formatter.locale = Locale.current
+self.triggerAddEventToCalendar(
+title: args["event_name"]!,
+startDate: formatter.date(from: args["start"]!) ?? Date(),
+endDate: formatter.date(from: args["end"]!) ?? Date()
+)
+case .case1(_):
+break
}
}
+case .text(let text):
+break
+case .image(_):
+break
}
}
-case .AgentTurnResponseStepCompletePayload(_):
-break
-case .AgentTurnResponseTurnStartPayload(_):
+case .step_complete(_):
+break
+case .turn_start(_):
break
-case .AgentTurnResponseTurnCompletePayload(_):
+case .turn_complete(_):
break
}
}
@@ -308,22 +307,17 @@ struct ContentView: View {
let createSystemResponse = try await self.agents.create(
request: Components.Schemas.CreateAgentRequest(
agent_config: Components.Schemas.AgentConfig(
+client_tools: [ CustomTools.getCreateEventToolForAgent() ],
enable_session_persistence: false,
instructions: "You are a helpful assistant",
max_infer_iters: 1,
-model: "Llama3.1-8B-Instruct",
-tools: [
-Components.Schemas.AgentConfig.toolsPayloadPayload.FunctionCallToolDefinition(
-CustomTools.getCreateEventTool()
-)
-]
+model: "meta-llama/Llama-3.1-8B-Instruct"
)
)
)
self.agentId = createSystemResponse.agent_id

-let createSessionResponse = try await self.agents.createSession(
-request: Components.Schemas.CreateAgentSessionRequest(agent_id: self.agentId, session_name: "llama-assistant")
+let createSessionResponse = try await self.agents.createSession(agent_id: self.agentId, request: Components.Schemas.CreateAgentSessionRequest(session_name: "llama-assistant")
)
self.agenticSystemSessionId = createSessionResponse.session_id

8 changes: 5 additions & 3 deletions examples/ios_quick_demo/README.md
@@ -2,9 +2,11 @@

iOSQuickDemo is a demo app ([video](https://drive.google.com/file/d/1HnME3VmsYlyeFgsIOMlxZy5c8S2xP4r4/view?usp=sharing)) that shows how to use the Llama Stack Swift SDK ([repo](https://github.com/meta-llama/llama-stack-client-swift)) and its `ChatCompletionRequest` API with a remote Llama Stack server to perform remote inference with Llama 3.1.

-# Installation
+## Installation

-## Prerequisite
+The quickest way to try out the demo for remote inference is using Together.ai's Llama Stack distro at https://llama-stack.together.ai - you can skip the next section and go to the Build and Run the iOS demo section directly.
+
+## (Optional) Build and Run Own Llama Stack Distro

You need to set up a remote Llama Stack distribution to run this demo. Assuming you have a [Fireworks](https://fireworks.ai/account/api-keys) or [Together](https://api.together.ai/) API key, which you can easily get by clicking the links above:

@@ -38,7 +40,7 @@ The default port is 5000 for `llama stack run` and you can specify a different port
![](quick1.png)
![](quick2.png)

-3. Replace the `RemoteInference` url string in `ContentView.swift` below with the host IP and port of the remote Llama Stack distro started in Prerequisite:
+3. (Optional) Replace the `RemoteInference` url string in `ContentView.swift` below with the host IP and port of the remote Llama Stack distro in Build and Run Own Llama Stack Distro:

```
let inference = RemoteInference(url: URL(string: "http://127.0.0.1:5000")!)
```
@@ -12,9 +12,9 @@ import LlamaStackClient
struct ContentView: View {
@State private var message: String = ""
@State private var userInput: String = "Best quotes in Godfather"

private let runnerQueue = DispatchQueue(label: "org.llamastack.iosquickdemo")

var body: some View {
VStack(spacing: 20) {
Text(message.isEmpty ? "Click Inference to see Llama's answer" : message)
@@ -24,11 +24,11 @@ struct ContentView: View {
.frame(maxWidth: .infinity)
.background(Color.gray.opacity(0.2))
.cornerRadius(8)

VStack(alignment: .leading, spacing: 10) {
Text("Question")
.font(.headline)

TextField("Enter your question here", text: $userInput)
.textFieldStyle(RoundedBorderTextFieldStyle())
.padding()
@@ -56,17 +56,19 @@ struct ContentView: View {
message = "Please enter a question before clicking 'Inference'."
return
}

message = ""

let workItem = DispatchWorkItem {
defer {
DispatchQueue.main.async {
}
}

Task {
-let inference = RemoteInference(url: URL(string: "http://127.0.0.1:5000")!)
+
+// replace the URL string if you build and run your own Llama Stack distro as shown in https://github.com/meta-llama/llama-stack-apps/tree/main/examples/ios_quick_demo#optional-build-and-run-own-llama-stack-distro
+let inference = RemoteInference(url: URL(string: "https://llama-stack.together.ai")!)

do {
for await chunk in try await inference.chatCompletion(
@@ -108,7 +110,7 @@ struct ContentView: View {
}
}
}

runnerQueue.async(execute: workItem)
}
}