| 1 | +# Quickstart: Aggregations |
| 2 | + |
| 3 | +This guide is a continuation of the [Intro](./intro.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 1 code on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/intro.py){:target="_blank"}. |
| 4 | + |
| 5 | +In this guide, we will add aggregations to our view to calculate general metrics about the candidates. |
| 6 | + |
| 7 | +## View Definition |
| 8 | + |
| 9 | +To add aggregations to our [structured view](../concepts/structured_views.md), we'll define new methods. These methods allow the LLM to perform calculations and summarize data across multiple rows. Let's add three aggregation methods to our `CandidateView`: |
| 10 | + |
| 11 | +```python |
| 12 | +class CandidateView(SqlAlchemyBaseView): |
| 13 | +    """ |
| 14 | +    A view for retrieving candidates from the database. |
| 15 | +    """ |
| 16 | + |
| 17 | +    def get_select(self) -> sqlalchemy.Select: |
| 18 | +        """ |
| 19 | +        Creates the initial SqlAlchemy select object, which will be used to build the query. |
| 20 | +        """ |
| 21 | +        return sqlalchemy.select(Candidate) |
| 22 | + |
| 23 | +    @decorators.view_aggregation() |
| 24 | +    def average_years_of_experience(self) -> sqlalchemy.Select: |
| 25 | +        """ |
| 26 | +        Calculates the average years of experience of candidates. |
| 27 | +        """ |
| 28 | +        return self.select.with_only_columns( |
| 29 | +            sqlalchemy.func.avg(Candidate.years_of_experience).label("average_years_of_experience") |
| 30 | +        ) |
| 31 | + |
| 32 | +    @decorators.view_aggregation() |
| 33 | +    def positions_per_country(self) -> sqlalchemy.Select: |
| 34 | +        """ |
| 35 | +        Returns the number of candidates per position per country. |
| 36 | +        """ |
| 37 | +        return ( |
| 38 | +            self.select.with_only_columns( |
| 39 | +                sqlalchemy.func.count(Candidate.position).label("number_of_positions"), |
| 40 | +                Candidate.position, |
| 41 | +                Candidate.country, |
| 42 | +            ) |
| 43 | +            .group_by(Candidate.position, Candidate.country) |
| 44 | +            .order_by(sqlalchemy.desc("number_of_positions")) |
| 45 | +        ) |
| 46 | + |
| 47 | +    @decorators.view_aggregation() |
| 48 | +    def candidates_per_country(self) -> sqlalchemy.Select: |
| 49 | +        """ |
| 50 | +        Returns the number of candidates per country. |
| 51 | +        """ |
| 52 | +        return ( |
| 53 | +            self.select.with_only_columns( |
| 54 | +                sqlalchemy.func.count(Candidate.id).label("number_of_candidates"), |
| 55 | +                Candidate.country, |
| 56 | +            ) |
| 57 | +            .group_by(Candidate.country) |
| 58 | +        ) |
| 59 | +``` |
| 60 | + |
| 61 | +By setting up these aggregations, you enable the LLM to calculate the average years of experience, the number of candidates per position per country, and the number of candidates per country. |
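
To make the three aggregations concrete, here is a minimal sketch of the SQL they roughly correspond to. It deliberately uses the standard-library `sqlite3` module instead of SQLAlchemy or db-ally, and the sample rows are made up for illustration; only the table and column names are taken from the view above. The `ORDER BY country` in the last query is an addition for deterministic output and is not part of the view.

```python
import sqlite3

# Hypothetical sample rows (position, country, years_of_experience).
# The real data comes from the database set up in the Intro guide.
ROWS = [
    ("Data Scientist", "Poland", 4),
    ("Data Scientist", "Poland", 6),
    ("Data Engineer", "Poland", 2),
    ("Data Scientist", "China", 8),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candidates ("
    "id INTEGER PRIMARY KEY, position TEXT, country TEXT, years_of_experience INTEGER)"
)
conn.executemany(
    "INSERT INTO candidates (position, country, years_of_experience) VALUES (?, ?, ?)",
    ROWS,
)

# Roughly what average_years_of_experience selects.
average = conn.execute(
    "SELECT avg(years_of_experience) AS average_years_of_experience FROM candidates"
).fetchone()[0]

# Roughly what positions_per_country selects: one row per (position, country) pair,
# with the most common pair first.
positions_per_country = conn.execute(
    "SELECT count(position) AS number_of_positions, position, country "
    "FROM candidates GROUP BY position, country "
    "ORDER BY number_of_positions DESC"
).fetchall()

# Roughly what candidates_per_country selects (ORDER BY added here for stable output).
candidates_per_country = conn.execute(
    "SELECT count(id) AS number_of_candidates, country "
    "FROM candidates GROUP BY country ORDER BY country"
).fetchall()

print(average)
print(positions_per_country[0])
print(candidates_per_country)
```

With these sample rows, the average is 5.0, the most common position and country pair is Data Scientist in Poland with two candidates, and the per-country counts are one for China and three for Poland.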
| 62 | + |
| 63 | +## Query Execution |
| 64 | + |
| 65 | +Having already defined and registered the view with the collection, we can now execute the query: |
| 66 | + |
| 67 | +```python |
| 68 | +result = await collection.ask("What is the average years of experience of candidates?") |
| 69 | +print(result.results) |
| 70 | +``` |
| 71 | + |
| 72 | +This will return the average years of experience of candidates. |
| 73 | + |
| 74 | +<details> |
| 75 | + <summary>The expected output</summary> |
| 76 | +``` |
| 77 | +The generated SQL query is: SELECT avg(candidates.years_of_experience) AS average_years_of_experience |
| 78 | +FROM candidates |
| 79 | + |
| 80 | +Number of rows: 1 |
| 81 | +{'average_years_of_experience': 4.98} |
| 82 | +``` |
| 83 | +</details> |
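
Because `collection.ask` is awaited, the call has to run inside a coroutine when used from a plain script. The sketch below shows one way to do that with `asyncio.run`. `StubCollection` and `StubResult` are hypothetical stand-ins written for this example so that it runs without db-ally or an LLM; they are not part of the library, and the shape of `results` (a list of row dictionaries) is an assumption based on the output shown above.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubResult:
    """Hypothetical stand-in for the object returned by `collection.ask`."""

    results: list


class StubCollection:
    """Hypothetical stand-in for a db-ally collection; not part of the library."""

    async def ask(self, question: str) -> StubResult:
        # A real collection would pick an aggregation with the LLM and run the SQL.
        # Here we return a fixed row mirroring the expected output above.
        return StubResult(results=[{"average_years_of_experience": 4.98}])


async def main() -> list:
    # In the real guide, `collection` is the one created in the Intro.
    collection = StubCollection()
    result = await collection.ask("What is the average years of experience of candidates?")
    return result.results


# asyncio.run starts an event loop, runs main() to completion, and returns its value.
rows = asyncio.run(main())
print(rows)
```

Replacing `StubCollection()` with the collection built in the Intro guide should leave the rest of the structure unchanged.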
| 84 | + |
| 85 | +Feel free to try other questions like: "What's the distribution of candidates across different positions and countries?" or "How many candidates are from China?". |
| 86 | + |
| 87 | +## Full Example |
| 88 | + |
| 89 | +Access the full example on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/aggregations.py){:target="_blank"}. |
| 90 | + |
| 91 | +## Next Steps |
| 92 | + |
| 93 | +Explore [Quickstart Part 3: Semantic Similarity](./semantic-similarity.md) to expand on the example and learn about using semantic similarity. |