
update for local inference demo for LS 0.1 #163

Merged: 11 commits merged on Feb 6, 2025
49 changes: 41 additions & 8 deletions examples/ios_calendar_assistant/README.md
@@ -1,14 +1,16 @@
# iOSCalendarAssistant

iOSCalendarAssistant is a demo app ([video](https://drive.google.com/file/d/1xjdYVm3zDnlxZGi40X_D4IgvmASfG5QZ/view?usp=sharing)) that takes a meeting transcript, summarizes it, extracts action items, and calls tools to book any followup meetings.
iOSCalendarAssistant is a demo app ([video](https://drive.google.com/file/d/1xjdYVm3zDnlxZGi40X_D4IgvmASfG5QZ/view?usp=sharing)) that uses the Llama Stack Swift SDK's remote inference and agent APIs to take a meeting transcript, summarize it, extract action items, and call tools to book any follow-up meetings.

You can also test creating a calendar event with a direct ask instead of a detailed meeting note.

# Installation
## Installation

We recommend you try the [iOS Quick Demo](../ios_quick_demo) first to confirm the prerequisites and installation; both demos share the same prerequisites and the first two installation steps.

## Prerequisite
The quickest way to try the demo with remote inference is to use Together.ai's Llama Stack distro at https://llama-stack.together.ai; if you do, you can skip the next section and go directly to the Build and Run the iOS Demo section.

## (Optional) Build and Run Own Llama Stack Distro

You need to set up a remote Llama Stack distribution to run this demo. Assuming you have a [Fireworks](https://fireworks.ai/account/api-keys) or [Together](https://api.together.ai/) API key, which you can get by clicking the links above:

@@ -39,24 +41,25 @@ The default port is 5000 for `llama stack run` and you can specify a different p

2. Under the iOSCalendarAssistant project - Package Dependencies, click the + sign, then add `https://github.com/meta-llama/llama-stack-client-swift` at the top right and 0.1.0 in the Dependency Rule, then click Add Package.

3. Replace the `RemoteInference` url string in `ContentView.swift` below with the host IP and port of the remote Llama Stack distro started in Prerequisite:
3. (Optional) If you built and ran your own Llama Stack distro, replace the `RemoteInference` url string in `ContentView.swift` below with the host IP and port of the distro started in Build and Run Own Llama Stack Distro:

```
private let agent = RemoteAgents(url: URL(string: "http://127.0.0.1:5000")!)
private let agent = RemoteAgents(url: URL(string: "https://llama-stack.together.ai")!)
```

**Note:** In order for the app to access the remote URL, the app's `Info.plist` needs to have the entry `App Transport Security Settings` with `Allow Arbitrary Loads` set to YES.

Also, to allow the app to add an event to the Calendar app, the `Info.plist` needs an entry `Privacy - Calendars Usage Description`, and when running the app for the first time, you need to accept the Calendar access request.
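The two `Info.plist` entries above correspond to the following raw-key fragment (a sketch: the key names are the standard ones Xcode writes for these settings, and the usage-description string is just an example):

```
<key>NSAppTransportSecurity</key>
<dict>
    <!-- Allow Arbitrary Loads = YES, so the app can reach the remote distro URL -->
    <key>NSAllowsArbitraryLoads</key>
    <true/>
</dict>
<!-- Privacy - Calendars Usage Description -->
<key>NSCalendarsUsageDescription</key>
<string>This app creates calendar events from meeting action items.</string>
```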

4. Build and run the app on an iOS simulator or your device. First, you may try a simple request:

```
Create a calendar event with a meeting title as Llama Stack update for 2-3pm January 27, 2025.
Create a calendar event with a meeting title as Llama Stack update for 2-3pm February 3, 2025.
```

Then, a detailed meeting note:
```
Date: January 20, 2025
Date: February 4, 2025
Time: 10:00 AM - 11:00 AM
Location: Zoom
Attendees:
@@ -80,7 +83,37 @@ Sarah: Good. Jane, any updates from operations?
Jane: Yes, logistics are sorted, and we’ve confirmed the warehouse availability. The only pending item is training customer support for the new product.
Sarah: Let’s coordinate with the training team to expedite that. Anything else?
Mike: Quick note—can we get feedback on the beta version by Friday?
Sarah: Yes, let’s make that a priority. Anything else? No? Great. Thanks, everyone. Let’s meet again next week from 4-5pm on January 27, 2025 to review progress.
Sarah: Yes, let’s make that a priority. Anything else? No? Great. Thanks, everyone. Let’s meet again next week from 4-5pm on February 11, 2025 to review progress.
```

You'll see a summary, action items, and a Calendar event created, made possible by Llama Stack's custom tool calling API support and Llama 3.1's tool calling capability.
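Before the app can create the Calendar event, it converts the tool call's `start`/`end` string arguments into `Date` values with a `DateFormatter` (the `"yyyy-MM-dd HH:mm"` format matches what `ContentView.swift` uses; the sample strings below are hypothetical model output for the prompt above):

```
import Foundation

// Mirror the formatter setup in ContentView.swift.
let formatter = DateFormatter()
formatter.dateFormat = "yyyy-MM-dd HH:mm"
formatter.timeZone = TimeZone.current
formatter.locale = Locale.current

// Hypothetical "start"/"end" arguments from a create_event tool call.
let start = formatter.date(from: "2025-02-11 16:00") ?? Date()
let end = formatter.date(from: "2025-02-11 17:00") ?? Date()
print(start <= end) // prints "true"
```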


# iOSCalendarAssistantWithLocalInf

iOSCalendarAssistantWithLocalInf is a demo app that uses Llama Stack Swift SDK's local inference and agent APIs and ExecuTorch to run local inference on device.

1. In your work folder, run the following commands:
```
git clone https://github.com/meta-llama/llama-stack-apps
cd llama-stack-apps
git submodule update --init --recursive
```

2. Go back to your work folder, then run:

```
git clone https://github.com/meta-llama/llama-stack
cd llama-stack
git submodule update --init --recursive
```

3. Double-click `llama-stack-apps/examples/ios_calendar_assistant/iOSCalendarAssistantWithLocalInf.xcodeproj` to open it in Xcode.

4. In the `iOSCalendarAssistantWithLocalInf` project panel, remove the existing `LocalInferenceImpl.xcodeproj` reference, then drag and drop `LocalInferenceImpl.xcodeproj` from `llama-stack/llama_stack/providers/inline/ios/inference` into the `iOSCalendarAssistantWithLocalInf` project.

5. Prepare a Llama model file named `llama3_2_spinquant_oct23.pte` by following the steps [here](https://github.com/pytorch/executorch/blob/main/examples/models/llama/README.md#step-2-prepare-model) - you'll also download the `tokenizer.model` file there. Then drag and drop both files into the `iOSCalendarAssistantWithLocalInf` project.

6. Build and run the app on an iOS simulator or a real device.

**Note:** If you see a build error about cmake not being found, install cmake by following the instructions [here](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/apple_ios/LLaMA/docs/delegates/xnnpack_README.md#1-install-cmake).
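The local setup boils down to the wiring shown in `ContentView.swift`'s initializer: a dedicated dispatch queue for the ExecuTorch runner, a `LocalInference` instance on that queue, and `LocalAgents` on top of it. A minimal sketch (the queue label is hypothetical, and the `LocalInference`/`LocalAgents` types come from the Llama Stack Swift SDK and the `LocalInferenceImpl` project, so this will not compile outside the app):

```
import Foundation

// Dedicated queue for the on-device ExecuTorch runner (label is an example).
let runnerQueue = DispatchQueue(label: "org.llamastack.localinference")

// Local inference backed by the bundled .pte model and tokenizer,
// with the agent APIs layered on top, as in ContentView.swift.
let inference = LocalInference(queue: runnerQueue)
let localAgents = LocalAgents(inference: inference)
```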
@@ -39,7 +39,9 @@ struct ContentView: View {
public init () {
self.inference = LocalInference(queue: runnerQueue)
self.localAgents = LocalAgents(inference: self.inference)
self.remoteAgents = RemoteAgents(url: URL(string: "http://localhost:5000")!)

// replace the URL string if you build and run your own Llama Stack distro as shown in https://github.com/meta-llama/llama-stack-apps/tree/main/examples/ios_calendar_assistant#optional-build-and-run-own-llama-stack-distro
self.remoteAgents = RemoteAgents(url: URL(string: "https://llama-stack.together.ai")!)
}

var agents: Agents {
@@ -130,39 +132,39 @@ struct ContentView: View {
func summarizeConversation(prompt: String) async {
do {
let request = Components.Schemas.CreateAgentTurnRequest(
agent_id: self.agentId,
messages: [
.UserMessage(Components.Schemas.UserMessage(
content: .case1("Summarize the following conversation in 1-2 sentences:\n\n \(prompt)"),
role: .user
))
],
session_id: self.agenticSystemSessionId,
stream: true
)

for try await chunk in try await self.agents.createTurn(request: request) {
for try await chunk in try await self.agents.createTurn(agent_id: self.agentId, session_id: self.agenticSystemSessionId, request: request) {
let payload = chunk.event.payload
switch (payload) {
case .AgentTurnResponseStepStartPayload(_):
case .step_start(_):
break
case .AgentTurnResponseStepProgressPayload(let step):
if (step.model_response_text_delta != nil) {
case .step_progress(let step):
if (step.delta != nil) {
DispatchQueue.main.async {
withAnimation {
var message = messages.removeLast()
message.text += step.model_response_text_delta!
if case .text(let delta) = step.delta {
message.text += "\(delta.text)"
}
message.tokenCount += 2
message.dateUpdated = Date()
messages.append(message)
}
}
}
case .AgentTurnResponseStepCompletePayload(_):
case .step_complete(_):
break
case .AgentTurnResponseTurnStartPayload(_):
case .turn_start(_):
break
case .AgentTurnResponseTurnCompletePayload(_):
case .turn_complete(_):
break

}
@@ -175,103 +177,100 @@ struct ContentView: View {

func actionItems(prompt: String) async throws {
let request = Components.Schemas.CreateAgentTurnRequest(
agent_id: self.agentId,
messages: [
.UserMessage(Components.Schemas.UserMessage(
content: .case1("List out any action items based on this text:\n\n \(prompt)"),
role: .user
))
],
session_id: self.agenticSystemSessionId,
stream: true
)

for try await chunk in try await self.agents.createTurn(request: request) {
for try await chunk in try await self.agents.createTurn(agent_id: self.agentId, session_id: self.agenticSystemSessionId, request: request) {
let payload = chunk.event.payload
switch (payload) {
case .AgentTurnResponseStepStartPayload(_):
case .step_start(_):
break
case .AgentTurnResponseStepProgressPayload(let step):
if (step.model_response_text_delta != nil) {
DispatchQueue.main.async {
withAnimation {
var message = messages.removeLast()
message.text += step.model_response_text_delta!
message.tokenCount += 2
message.dateUpdated = Date()
messages.append(message)

self.actionItems += step.model_response_text_delta!
case .step_progress(let step):
DispatchQueue.main.async(execute: DispatchWorkItem {
withAnimation {
var message = messages.removeLast()

if case .text(let delta) = step.delta {
message.text += "\(delta.text)"
self.actionItems += "\(delta.text)"
}
message.tokenCount += 2
message.dateUpdated = Date()
messages.append(message)
}
}
case .AgentTurnResponseStepCompletePayload(_):
})
case .step_complete(_):
break
case .AgentTurnResponseTurnStartPayload(_):
case .turn_start(_):
break
case .AgentTurnResponseTurnCompletePayload(_):
case .turn_complete(_):
break
}
}
}

func callTools(prompt: String) async throws {
let request = Components.Schemas.CreateAgentTurnRequest(
agent_id: self.agentId,
messages: [
.UserMessage(Components.Schemas.UserMessage(
content: .case1("Call functions as needed to handle any actions in the following text:\n\n" + prompt),
role: .user
))
],
session_id: self.agenticSystemSessionId,
stream: true
)

for try await chunk in try await self.agents.createTurn(request: request) {
for try await chunk in try await self.agents.createTurn(agent_id: self.agentId, session_id: self.agenticSystemSessionId, request: request) {
let payload = chunk.event.payload
switch (payload) {
case .AgentTurnResponseStepStartPayload(_):
case .step_start(_):
break
case .AgentTurnResponseStepProgressPayload(let step):
if (step.tool_call_delta != nil) {
switch (step.tool_call_delta!.content) {
case .case1(_):
break
case .ToolCall(let call):
switch (call.tool_name) {
case .BuiltinTool(_):
break
case .case2(let toolName):
if (toolName == "create_event") {
var args: [String : String] = [:]
for (arg_name, arg) in call.arguments.additionalProperties {
switch (arg) {
case .case1(let s): // type string
args[arg_name] = s
case .case2(_), .case3(_), .case4(_), .case5(_), .case6(_):
break
case .step_progress(let step):
switch (step.delta) {
case .tool_call(let call):
if call.parse_status == .succeeded {
switch (call.tool_call) {
case .ToolCall(let toolCall):
var args: [String : String] = [:]
for (arg_name, arg) in toolCall.arguments.additionalProperties {
switch (arg) {
case .case1(let s):
args[arg_name] = s
case .case2(_), .case3(_), .case4(_), .case5(_), .case6(_):
break
}
}
}

let formatter = DateFormatter()
formatter.dateFormat = "yyyy-MM-dd HH:mm"
formatter.timeZone = TimeZone.current
formatter.locale = Locale.current
self.triggerAddEventToCalendar(
title: args["event_name"]!,
startDate: formatter.date(from: args["start"]!) ?? Date(),
endDate: formatter.date(from: args["end"]!) ?? Date()
)
let formatter = DateFormatter()
formatter.dateFormat = "yyyy-MM-dd HH:mm"
formatter.timeZone = TimeZone.current
formatter.locale = Locale.current
self.triggerAddEventToCalendar(
title: args["event_name"]!,
startDate: formatter.date(from: args["start"]!) ?? Date(),
endDate: formatter.date(from: args["end"]!) ?? Date()
)
case .case1(_):
break
}
}
case .text(_):
break
case .image(_):
break
}
}
case .AgentTurnResponseStepCompletePayload(_):
break
case .AgentTurnResponseTurnStartPayload(_):
case .step_complete(_):
break
case .turn_start(_):
break
case .AgentTurnResponseTurnCompletePayload(_):
case .turn_complete(_):
break
}
}
@@ -308,22 +307,17 @@ struct ContentView: View {
let createSystemResponse = try await self.agents.create(
request: Components.Schemas.CreateAgentRequest(
agent_config: Components.Schemas.AgentConfig(
client_tools: [ CustomTools.getCreateEventToolForAgent() ],
enable_session_persistence: false,
instructions: "You are a helpful assistant",
max_infer_iters: 1,
model: "Llama3.1-8B-Instruct",
tools: [
Components.Schemas.AgentConfig.toolsPayloadPayload.FunctionCallToolDefinition(
CustomTools.getCreateEventTool()
)
]
model: "meta-llama/Llama-3.1-8B-Instruct"
)
)
)
self.agentId = createSystemResponse.agent_id

let createSessionResponse = try await self.agents.createSession(
request: Components.Schemas.CreateAgentSessionRequest(agent_id: self.agentId, session_name: "llama-assistant")
let createSessionResponse = try await self.agents.createSession(agent_id: self.agentId, request: Components.Schemas.CreateAgentSessionRequest(session_name: "llama-assistant")
)
self.agenticSystemSessionId = createSessionResponse.session_id

8 changes: 5 additions & 3 deletions examples/ios_quick_demo/README.md
@@ -2,9 +2,11 @@

iOSQuickDemo is a demo app ([video](https://drive.google.com/file/d/1HnME3VmsYlyeFgsIOMlxZy5c8S2xP4r4/view?usp=sharing)) that shows how to use the Llama Stack Swift SDK ([repo](https://github.com/meta-llama/llama-stack-client-swift)) and its `ChatCompletionRequest` API with a remote Llama Stack server to perform remote inference with Llama 3.1.

# Installation
## Installation

## Prerequisite
The quickest way to try the demo with remote inference is to use Together.ai's Llama Stack distro at https://llama-stack.together.ai; if you do, you can skip the next section and go directly to the Build and Run the iOS Demo section.

## (Optional) Build and Run Own Llama Stack Distro

You need to set up a remote Llama Stack distribution to run this demo. Assuming you have a [Fireworks](https://fireworks.ai/account/api-keys) or [Together](https://api.together.ai/) API key, which you can get by clicking the links above:

@@ -38,7 +40,7 @@ The default port is 5000 for `llama stack run` and you can specify a different p
![](quick1.png)
![](quick2.png)

3. Replace the `RemoteInference` url string in `ContentView.swift` below with the host IP and port of the remote Llama Stack distro started in Prerequisite:
3. (Optional) If you built and ran your own Llama Stack distro, replace the `RemoteInference` url string in `ContentView.swift` below with the host IP and port of the distro started in Build and Run Own Llama Stack Distro:

```
let inference = RemoteInference(url: URL(string: "http://127.0.0.1:5000")!)