braintrustdata · edenh · Sep 29, 2025 · Oct 3, 2025 · Oct 3, 2025 · Oct 3, 2025
diff --git a/examples/APIAgent-Py/APIAgent.ipynb b/examples/APIAgent-Py/APIAgent.ipynb
@@ -826,7 +826,7 @@
    "source": [
     "Awesome! The logs now have a `no_hallucination` score which we can use to filter down hallucinations.\n",
     "\n",
-    "![Hallucination logs](./assets/logs-with-score.gif)\n"
+    "![Hallucination logs](./assets/logs-with-score.mp4)\n"
    ]
   },
   {
@@ -839,7 +839,7 @@
     "non-hallucinations are correct, but in a real-world scenario, you could [collect user feedback](https://www.braintrust.dev/docs/guides/logging#user-feedback)\n",
     "and treat positively rated feedback as ground truth.\n",
     "\n",
-    "![Dataset setup](./assets/dataset-setup.gif)\n",
+    "![Dataset setup](./assets/dataset-setup.mp4)\n",
     "\n",
     "## Running evals\n",
     "\n",
@@ -1020,7 +1020,7 @@
     "\n",
     "To understand why, we can filter down to this regression, and take a look at a side-by-side diff.\n",
     "\n",
-    "![Regression diff](./assets/regression-diff.gif)\n",
+    "![Regression diff](./assets/regression-diff.mp4)\n",
     "\n",
     "Does it matter whether or not the model generates these fields? That's a good question and something you can work on as a next step.\n",
     "Maybe you should tweak how Factuality works, or change the prompt to always return a consistent set of fields.\n",

diff --git a/examples/APIAgent-Py/assets/dataset-setup.gif b/examples/APIAgent-Py/assets/dataset-setup.gif
diff --git a/examples/APIAgent-Py/assets/dataset-setup.mp4 b/examples/APIAgent-Py/assets/dataset-setup.mp4
diff --git a/examples/APIAgent-Py/assets/logs-with-score.gif b/examples/APIAgent-Py/assets/logs-with-score.gif
diff --git a/examples/APIAgent-Py/assets/logs-with-score.mp4 b/examples/APIAgent-Py/assets/logs-with-score.mp4
diff --git a/examples/APIAgent-Py/assets/regression-diff.gif b/examples/APIAgent-Py/assets/regression-diff.gif
diff --git a/examples/APIAgent-Py/assets/regression-diff.mp4 b/examples/APIAgent-Py/assets/regression-diff.mp4
diff --git a/examples/ClassifyingNewsArticles/ClassifyingNewsArticles.ipynb b/examples/ClassifyingNewsArticles/ClassifyingNewsArticles.ipynb
@@ -423,7 +423,7 @@
     "- You should see the eval scores increase and you can see which test cases improved.\n",
     "- You can also filter the test cases by improvements to know exactly why the scores changed.\n",
     "\n",
-    "![Compare](assets/inspect.gif)\n",
+    "![Compare](assets/inspect.mp4)\n",
     "\n"
    ]
   },

diff --git a/examples/ClassifyingNewsArticles/assets/inspect.gif b/examples/ClassifyingNewsArticles/assets/inspect.gif
diff --git a/examples/ClassifyingNewsArticles/assets/inspect.mp4 b/examples/ClassifyingNewsArticles/assets/inspect.mp4
diff --git a/examples/Github-Issues/Github-Issues.ipynb b/examples/Github-Issues/Github-Issues.ipynb
@@ -482,7 +482,7 @@
     "\n",
     "Happy evaluating!\n",
     "\n",
-    "![improvements](./assets/improvements.gif)\n"
+    "![improvements](./assets/improvements.mp4)\n"
    ]
   }
  ],

diff --git a/examples/Github-Issues/assets/improvements.gif b/examples/Github-Issues/assets/improvements.gif
diff --git a/examples/Github-Issues/assets/improvements.mp4 b/examples/Github-Issues/assets/improvements.mp4
diff --git a/examples/LLaMa-3_1-Tools/LLaMa-3_1-Tools.ipynb b/examples/LLaMa-3_1-Tools/LLaMa-3_1-Tools.ipynb
@@ -756,7 +756,7 @@
     "\n",
     "Although it's a fraction of the cost, it's both slower (likely due to rate limits) and worse performing than GPT-4o. 12 of the 60 cases failed to parse. Let's take a look at one of those in depth.\n",
     "\n",
-    "![parsing-failure](./assets/parsing-failure.gif)\n",
+    "![parsing-failure](./assets/parsing-failure.mp4)\n",
     "\n",
     "That definitely looks like an invalid tool call. Maybe we can experiment with tweaking the prompt to get better results.\n",
     "\n",

diff --git a/examples/LLaMa-3_1-Tools/assets/parsing-failure.gif b/examples/LLaMa-3_1-Tools/assets/parsing-failure.gif
diff --git a/examples/LLaMa-3_1-Tools/assets/parsing-failure.mp4 b/examples/LLaMa-3_1-Tools/assets/parsing-failure.mp4
diff --git a/examples/OTEL-logging/assets/add-post-filter.gif b/examples/OTEL-logging/assets/add-post-filter.gif
diff --git a/examples/OTEL-logging/assets/add-post-filter.mp4 b/examples/OTEL-logging/assets/add-post-filter.mp4
diff --git a/examples/OTEL-logging/assets/otel-demo.gif b/examples/OTEL-logging/assets/otel-demo.gif
diff --git a/examples/OTEL-logging/assets/otel-demo.mp4 b/examples/OTEL-logging/assets/otel-demo.mp4
diff --git a/examples/OTEL-logging/assets/spans.gif b/examples/OTEL-logging/assets/spans.gif
diff --git a/examples/OTEL-logging/assets/spans.mp4 b/examples/OTEL-logging/assets/spans.mp4
diff --git a/examples/OTEL-logging/otel-logging.mdx b/examples/OTEL-logging/otel-logging.mdx
@@ -141,19 +141,19 @@ Run `npm install` to install the required dependencies, then `npm run dev` to la
 
 Open your Braintrust project to the **Logs** page, and select **What orders have shipped?** in your applications. You should be able to watch the logs filter in as your application makes HTTP requests and LLM calls.
 
-![LLM calls and logs side by side](assets/otel-demo.gif)
+![LLM calls and logs side by side](assets/otel-demo.mp4)
 
 Because this application is using multi-step streaming and tool calls, the logs are especially interesting. In Braintrust, logs consist of [traces](/docs/guides/traces), which roughly correspond to a single request or interaction in your application. Traces consist of one or more spans, each of which corresponds to a unit of work in your application. In this example, each step and tool call is logged inside of its own span. This level of granularity makes it easier to debug issues, track user behavior, and collect data into datasets.
 
 ### Filtering your logs
 
 Run a couple more queries in the app and notice the logs that are generated. Our app is logging both `GET` and `POST` requests, but we’re most interested in the `POST` requests since they contain our LLM calls. We can apply a filter using the [BTQL](/docs/reference/btql) query `Name LIKE 'POST%'` so that we only see the traces we care about:
 
-![Filter using BTQL](assets/add-post-filter.gif)
+![Filter using BTQL](assets/add-post-filter.mp4)
 
 You should now have a list of traces for all the `POST` requests your app has made. Each contains the inputs and outputs of each LLM call in a span called `ai.streamText`. If you go further into the trace, you’ll also notice a span for each tool call.
 
-![Expanding tool call and stream spans](assets/spans.gif)
+![Expanding tool call and stream spans](assets/spans.mp4)
 
 This is valuable data that can be used to evaluate the quality of accuracy of your application in Braintrust.
 

diff --git a/examples/PDFPlayground/PDFPlayground.mdx b/examples/PDFPlayground/PDFPlayground.mdx
@@ -348,7 +348,7 @@ Once your traces have been logged, you can use the Braintrust UI to manage your
 
 You can store the user spans from your PDF traces into a dataset. Select the span, and then select **Add span to dataset**, or use the hotkey `D` to speed this up.
 
-![add span to dataset](./assets/add-span-to-dataset.gif)
+![add span to dataset](./assets/add-span-to-dataset.mp4)
 
 ### Trying system prompts in a playground
 
@@ -357,21 +357,21 @@ Select a system prompt span, and then select **Try prompt** to:
 1. Save the prompt (for example, "system1") to your library by selecting **Save as custom prompt**
 2. Launch a playground using the saved prompt by selecting **Create playground with prompt**
 
-![try prompt from span](./assets/try-prompt.gif)
+![try prompt from span](./assets/try-prompt.mp4)
 
 ### File attachment methods
 
 There are two ways to attach PDF files in playgrounds: using the paperclip button in the UI, or specifying a public URL. Let's walk through each method:
 
 - To upload files directly from your local machine, start by selecting **+ Message** to add a user prompt. Then, select **+ Message Part** > **File**. This will display a paperclip icon on the right side. Select it to upload a file from your local machine.
 
-![paperclip UI method](./assets/paperclip.gif)
+![paperclip UI method](./assets/paperclip.mp4)
 
 This method is particularly useful when you're working with local files that aren't accessible via public URL.
 
 - To use the public URL method, paste the URL directly into the file message input field. You can also use mustache syntax to extract the URL from metadata.
 
-![public url method](./assets/url.gif)
+![public url method](./assets/url.mp4)
 
 This method streamlines the process when you're working with publicly available PDFs, like the earnings call transcripts we're using in this cookbook.
 

diff --git a/examples/PDFPlayground/assets/add-span-to-dataset.gif b/examples/PDFPlayground/assets/add-span-to-dataset.gif
diff --git a/examples/PDFPlayground/assets/add-span-to-dataset.mp4 b/examples/PDFPlayground/assets/add-span-to-dataset.mp4
diff --git a/examples/PDFPlayground/assets/paperclip.gif b/examples/PDFPlayground/assets/paperclip.gif
diff --git a/examples/PDFPlayground/assets/paperclip.mp4 b/examples/PDFPlayground/assets/paperclip.mp4
diff --git a/examples/PDFPlayground/assets/try-prompt.gif b/examples/PDFPlayground/assets/try-prompt.gif
diff --git a/examples/PDFPlayground/assets/try-prompt.mp4 b/examples/PDFPlayground/assets/try-prompt.mp4
diff --git a/examples/PDFPlayground/assets/url.gif b/examples/PDFPlayground/assets/url.gif
diff --git a/examples/PDFPlayground/assets/url.mp4 b/examples/PDFPlayground/assets/url.mp4
diff --git a/examples/ProviderBenchmark/ProviderBenchmark.ipynb b/examples/ProviderBenchmark/ProviderBenchmark.ipynb
@@ -433,7 +433,7 @@
     "\n",
     "Let's start by looking at the project view. Braintrust makes it easy to morph this into a multi-level grouped analysis where we can see the score vs. duration in a scatter plot, and how each provider stacks up in the table.\n",
     "\n",
-    "![Setting up the table](./assets/configuring-graph.gif)\n",
+    "![Setting up the table](./assets/configuring-graph.mp4)\n",
     "\n",
     "### Insights\n",
     "\n",

diff --git a/examples/ProviderBenchmark/assets/configuring-graph.gif b/examples/ProviderBenchmark/assets/configuring-graph.gif
diff --git a/examples/ProviderBenchmark/assets/configuring-graph.mp4 b/examples/ProviderBenchmark/assets/configuring-graph.mp4
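
Every text hunk above makes the same change: a markdown image target ending in `.gif` now ends in `.mp4`. The sketch below shows one way such a rewrite could be scripted. It is not necessarily how this change was produced, and the names `GIF_IMAGE` and `swap_gif_for_mp4` are hypothetical. It only rewrites references; the asset files themselves would still need to be converted separately.

```python
import re

# Matches a markdown image whose target ends in .gif,
# e.g. ![Compare](assets/inspect.gif). Hypothetical helper, not from the PR.
GIF_IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\.gif\)")


def swap_gif_for_mp4(text: str) -> str:
    """Rewrite every markdown image target ending in .gif to end in .mp4."""
    return GIF_IMAGE.sub(r"![\1](\2.mp4)", text)


if __name__ == "__main__":
    # A notebook source line shaped like the ones touched in the diff.
    line = '"![Compare](assets/inspect.gif)\\n",'
    print(swap_gif_for_mp4(line))
```

Because the pattern is anchored on the `![alt](target)` syntax, plain prose mentioning `.gif` and non-image links are left untouched.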