6 changes: 3 additions & 3 deletions examples/APIAgent-Py/APIAgent.ipynb
@@ -826,7 +826,7 @@
"source": [
"Awesome! The logs now have a `no_hallucination` score which we can use to filter down hallucinations.\n",
"\n",
-"![Hallucination logs](./assets/logs-with-score.gif)\n"
+"![Hallucination logs](./assets/logs-with-score.mp4)\n"
]
},
{
@@ -839,7 +839,7 @@
"non-hallucinations are correct, but in a real-world scenario, you could [collect user feedback](https://www.braintrust.dev/docs/guides/logging#user-feedback)\n",
"and treat positively rated feedback as ground truth.\n",
"\n",
-"![Dataset setup](./assets/dataset-setup.gif)\n",
+"![Dataset setup](./assets/dataset-setup.mp4)\n",
"\n",
"## Running evals\n",
"\n",
@@ -1020,7 +1020,7 @@
"\n",
"To understand why, we can filter down to this regression, and take a look at a side-by-side diff.\n",
"\n",
-"![Regression diff](./assets/regression-diff.gif)\n",
+"![Regression diff](./assets/regression-diff.mp4)\n",
"\n",
"Does it matter whether or not the model generates these fields? That's a good question and something you can work on as a next step.\n",
"Maybe you should tweak how Factuality works, or change the prompt to always return a consistent set of fields.\n",
Binary file removed examples/APIAgent-Py/assets/dataset-setup.gif
Binary file not shown.
Binary file added examples/APIAgent-Py/assets/dataset-setup.mp4
Binary file not shown.
Binary file removed examples/APIAgent-Py/assets/logs-with-score.gif
Binary file not shown.
Binary file added examples/APIAgent-Py/assets/logs-with-score.mp4
Binary file not shown.
Binary file removed examples/APIAgent-Py/assets/regression-diff.gif
Binary file not shown.
Binary file added examples/APIAgent-Py/assets/regression-diff.mp4
Binary file not shown.
@@ -423,7 +423,7 @@
"- You should see the eval scores increase and you can see which test cases improved.\n",
"- You can also filter the test cases by improvements to know exactly why the scores changed.\n",
"\n",
-"![Compare](assets/inspect.gif)\n",
+"![Compare](assets/inspect.mp4)\n",
"\n"
]
},
Binary file removed examples/ClassifyingNewsArticles/assets/inspect.gif
Binary file not shown.
Binary file not shown.
2 changes: 1 addition & 1 deletion examples/Github-Issues/Github-Issues.ipynb
@@ -482,7 +482,7 @@
"\n",
"Happy evaluating!\n",
"\n",
-"![improvements](./assets/improvements.gif)\n"
+"![improvements](./assets/improvements.mp4)\n"
]
}
],
Binary file removed examples/Github-Issues/assets/improvements.gif
Binary file not shown.
Binary file added examples/Github-Issues/assets/improvements.mp4
Binary file not shown.
2 changes: 1 addition & 1 deletion examples/LLaMa-3_1-Tools/LLaMa-3_1-Tools.ipynb
@@ -756,7 +756,7 @@
"\n",
"Although it's a fraction of the cost, it's both slower (likely due to rate limits) and worse performing than GPT-4o. 12 of the 60 cases failed to parse. Let's take a look at one of those in depth.\n",
"\n",
-"![parsing-failure](./assets/parsing-failure.gif)\n",
+"![parsing-failure](./assets/parsing-failure.mp4)\n",
"\n",
"That definitely looks like an invalid tool call. Maybe we can experiment with tweaking the prompt to get better results.\n",
"\n",
Binary file removed examples/LLaMa-3_1-Tools/assets/parsing-failure.gif
Binary file not shown.
Binary file not shown.
Binary file removed examples/OTEL-logging/assets/add-post-filter.gif
Binary file not shown.
Binary file added examples/OTEL-logging/assets/add-post-filter.mp4
Binary file not shown.
Binary file removed examples/OTEL-logging/assets/otel-demo.gif
Binary file not shown.
Binary file added examples/OTEL-logging/assets/otel-demo.mp4
Binary file not shown.
Binary file removed examples/OTEL-logging/assets/spans.gif
Binary file not shown.
Binary file added examples/OTEL-logging/assets/spans.mp4
Binary file not shown.
6 changes: 3 additions & 3 deletions examples/OTEL-logging/otel-logging.mdx
@@ -141,19 +141,19 @@ Run `npm install` to install the required dependencies, then `npm run dev` to la

Open your Braintrust project to the **Logs** page, and select **What orders have shipped?** in your applications. You should be able to watch the logs filter in as your application makes HTTP requests and LLM calls.

-![LLM calls and logs side by side](assets/otel-demo.gif)
+![LLM calls and logs side by side](assets/otel-demo.mp4)

Because this application is using multi-step streaming and tool calls, the logs are especially interesting. In Braintrust, logs consist of [traces](/docs/guides/traces), which roughly correspond to a single request or interaction in your application. Traces consist of one or more spans, each of which corresponds to a unit of work in your application. In this example, each step and tool call is logged inside of its own span. This level of granularity makes it easier to debug issues, track user behavior, and collect data into datasets.

### Filtering your logs

Run a couple more queries in the app and notice the logs that are generated. Our app is logging both `GET` and `POST` requests, but we’re most interested in the `POST` requests since they contain our LLM calls. We can apply a filter using the [BTQL](/docs/reference/btql) query `Name LIKE 'POST%'` so that we only see the traces we care about:

-![Filter using BTQL](assets/add-post-filter.gif)
+![Filter using BTQL](assets/add-post-filter.mp4)

You should now have a list of traces for all the `POST` requests your app has made. Each contains the inputs and outputs of each LLM call in a span called `ai.streamText`. If you go further into the trace, you’ll also notice a span for each tool call.

-![Expanding tool call and stream spans](assets/spans.gif)
+![Expanding tool call and stream spans](assets/spans.mp4)

This is valuable data that can be used to evaluate the quality of accuracy of your application in Braintrust.

8 changes: 4 additions & 4 deletions examples/PDFPlayground/PDFPlayground.mdx
@@ -348,7 +348,7 @@ Once your traces have been logged, you can use the Braintrust UI to manage your

You can store the user spans from your PDF traces into a dataset. Select the span, and then select **Add span to dataset**, or use the hotkey `D` to speed this up.

-![add span to dataset](./assets/add-span-to-dataset.gif)
+![add span to dataset](./assets/add-span-to-dataset.mp4)

### Trying system prompts in a playground

@@ -357,21 +357,21 @@ Select a system prompt span, and then select **Try prompt** to:
1. Save the prompt (for example, "system1") to your library by selecting **Save as custom prompt**
2. Launch a playground using the saved prompt by selecting **Create playground with prompt**

-![try prompt from span](./assets/try-prompt.gif)
+![try prompt from span](./assets/try-prompt.mp4)

### File attachment methods

There are two ways to attach PDF files in playgrounds: using the paperclip button in the UI, or specifying a public URL. Let's walk through each method:

- To upload files directly from your local machine, start by selecting **+ Message** to add a user prompt. Then, select **+ Message Part** > **File**. This will display a paperclip icon on the right side. Select it to upload a file from your local machine.

-![paperclip UI method](./assets/paperclip.gif)
+![paperclip UI method](./assets/paperclip.mp4)

This method is particularly useful when you're working with local files that aren't accessible via public URL.

- To use the public URL method, paste the URL directly into the file message input field. You can also use mustache syntax to extract the URL from metadata.

-![public url method](./assets/url.gif)
+![public url method](./assets/url.mp4)

This method streamlines the process when you're working with publicly available PDFs, like the earnings call transcripts we're using in this cookbook.

Binary file not shown.
Binary file not shown.
Binary file removed examples/PDFPlayground/assets/paperclip.gif
Binary file not shown.
Binary file added examples/PDFPlayground/assets/paperclip.mp4
Binary file not shown.
Binary file removed examples/PDFPlayground/assets/try-prompt.gif
Binary file not shown.
Binary file added examples/PDFPlayground/assets/try-prompt.mp4
Binary file not shown.
Binary file removed examples/PDFPlayground/assets/url.gif
Binary file not shown.
Binary file added examples/PDFPlayground/assets/url.mp4
Binary file not shown.
2 changes: 1 addition & 1 deletion examples/ProviderBenchmark/ProviderBenchmark.ipynb
@@ -433,7 +433,7 @@
"\n",
"Let's start by looking at the project view. Braintrust makes it easy to morph this into a multi-level grouped analysis where we can see the score vs. duration in a scatter plot, and how each provider stacks up in the table.\n",
"\n",
-"![Setting up the table](./assets/configuring-graph.gif)\n",
+"![Setting up the table](./assets/configuring-graph.mp4)\n",
"\n",
"### Insights\n",
"\n",
Binary file not shown.
Binary file not shown.