Commit 917bab5

feat: add support for langchain (#94)
1 parent a877b0c commit 917bab5

17 files changed, +793 -225 lines changed

docs/integrations/langchain.ipynb

+401
Large diffs are not rendered by default.

docs/integrations/langsmith.ipynb

+162 -162

The diff re-matches the whole notebook, so every line shows as removed and re-added, but the only source-level change is a formatting pass on the `evaluate(...)` call (trailing whitespace dropped, trailing comma added):

```diff
@@ -121,4 +121,4 @@
     "result = evaluate(\n",
-    "    fiqa_eval[\"baseline\"].select(range(3)), \n",
-    "    metrics=[context_relevancy, faithfulness, answer_relevancy]\n",
+    "    fiqa_eval[\"baseline\"].select(range(3)),\n",
+    "    metrics=[context_relevancy, faithfulness, answer_relevancy],\n",
     ")\n",
```

The notebook's content is otherwise unchanged. For reference:

# Langsmith Integrations

[Langsmith](https://docs.smith.langchain.com/) is a platform for building production-grade LLM applications from the langchain team. It helps you with tracing, debugging and evaluating LLM applications.

The langsmith + ragas integration offers two features:
1. View the traces of the ragas `evaluator`
2. Use ragas metrics in langchain evaluation (coming soon)

### Tracing ragas metrics

Since ragas uses langchain under the hood, all you have to do is set up langsmith and your traces will be logged.

To set up langsmith, make sure the following env vars are set (you can read more in the [langsmith docs](https://docs.smith.langchain.com/#quick-start)); an in-notebook alternative using `os.environ` is sketched at the end of this section:

```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<your-project> # if not specified, defaults to "default"
```

Once langsmith is set up, just run the evaluations as you normally would:

```python
from datasets import load_dataset
from ragas.metrics import context_relevancy, answer_relevancy, faithfulness
from ragas import evaluate


fiqa_eval = load_dataset("explodinggradients/fiqa", "ragas_eval")

result = evaluate(
    fiqa_eval["baseline"].select(range(3)),
    metrics=[context_relevancy, faithfulness, answer_relevancy],
)

result
```

```
Found cached dataset fiqa (/home/jjmachan/.cache/huggingface/datasets/explodinggradients___fiqa/ragas_eval/1.0.0/3dc7b639f5b4b16509a3299a2ceb78bf5fe98ee6b5fee25e7d5e4d290c88efb8)
evaluating with [context_ relevancy]
100%|████████████████████████████████████████████████████████████| 1/1 [00:04<00:00, 4.90s/it]
evaluating with [faithfulness]
100%|████████████████████████████████████████████████████████████| 1/1 [00:21<00:00, 21.01s/it]
evaluating with [answer_relevancy]
100%|████████████████████████████████████████████████████████████| 1/1 [00:07<00:00, 7.36s/it]

{'ragas_score': 0.1837, 'context_ relevancy': 0.0707, 'faithfulness': 0.8889, 'answer_relevancy': 0.9403}
```

Voila! Now you can head over to your project and see the traces.

![](../assets/langsmith-tracing-overview.png)
The langsmith tracing dashboard overview.

![](../assets/langsmith-tracing-faithfullness.png)
The traces for the faithfulness metric. As you can see, being able to view the reasons why the metric gave the score is helpful in figuring out how to improve it.
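If you'd rather configure langsmith from inside the notebook than via shell exports, the same variables can be assigned through `os.environ` before running the evaluation. A minimal sketch with placeholder values, mirroring the shell exports above:

```python
import os

# Same variables as the shell exports; fill in your own key and project.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
os.environ["LANGCHAIN_API_KEY"] = "<your-api-key>"
os.environ["LANGCHAIN_PROJECT"] = "<your-project>"  # defaults to "default" if unset
```

Setting these before the evaluation runs should be enough for the tracer to pick them up.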

src/ragas/async_utils.py

+21 -1

```diff
@@ -1,6 +1,7 @@
 """Async utils."""
 import asyncio
-from typing import Any, Coroutine, List
+from itertools import zip_longest
+from typing import Any, Coroutine, Iterable, List


 def run_async_tasks(
@@ -29,6 +30,8 @@ async def _tqdm_gather() -> List[Any]:
         # run the operation w/o tqdm on hitting a fatal
         # may occur in some environments where tqdm.asyncio
         # is not supported
+        except ImportError as e:
+            print(e)
         except Exception:
             pass

@@ -37,3 +40,20 @@ async def _gather() -> List[Any]:

     outputs: List[Any] = asyncio.run(_gather())
     return outputs
+
+
+def chunks(iterable: Iterable, size: int) -> Iterable:
+    args = [iter(iterable)] * size
+    return zip_longest(*args, fillvalue=None)
+
+
+async def batch_gather(
+    tasks: List[Coroutine], batch_size: int = 10, verbose: bool = False
+) -> List[Any]:
+    output: List[Any] = []
+    for task_chunk in chunks(tasks, batch_size):
+        output_chunk = await asyncio.gather(*task_chunk)
+        output.extend(output_chunk)
+        if verbose:
+            print(f"Completed {len(output)} out of {len(tasks)} tasks")
+    return output
```
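`batch_gather` awaits coroutines in fixed-size waves rather than all at once, which bounds how many requests are in flight at a time. A minimal usage sketch under two assumptions: the `fetch` coroutine is a hypothetical stand-in for real work, and the task count is kept a multiple of `batch_size`, since `chunks` pads its final group with `None` (the `zip_longest` fillvalue), which a ragged final batch would pass straight to `asyncio.gather`:

```python
import asyncio

from ragas.async_utils import batch_gather


# Hypothetical stand-in for real async work (e.g. one LLM call per row).
async def fetch(i: int) -> int:
    await asyncio.sleep(0.01)
    return i * i


async def main() -> None:
    # 20 coroutines, batch_size=10: two full waves, so the None padding
    # from chunks() never reaches asyncio.gather in this sketch.
    tasks = [fetch(i) for i in range(20)]
    results = await batch_gather(tasks, batch_size=10, verbose=True)
    print(results[:5])  # [0, 1, 4, 9, 16]


asyncio.run(main())
```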

src/ragas/langchain/__init__.py

Whitespace-only changes.
