configurable table prefix (#971)
* prototyping

* nothing

* adjust ini usage

* ini

* Fixing docs.

* satisfy abc for deprecated db

* indent fix

* few more formatting fixes

* more format fixes

* refactor db init in Tru

* move large design decision sections from docstrings to a single md file.

* add design.md and notes

* doc fixes

* cleaning up db implementation docs and some code

* work

* forgot some

* add database_prefix to Tru

* moving and renaming things

* ignores

* fixes from the move/renames

* remove accidental file

* more fixes, and fix reset_database

* print arg parse exception

* streamlit state fixes

* forgot to remove old import

* update database migration notes and copy_database

* move old database to legacy databases folder

* rename class

* typos and doc fixes

* nits

* nits

* testing copy_database

* format

* working on prefix tests

* debugging the foreign keys

* debugging more

* clean up

* typo

* moved on_done_callback content into result() and added some database tests

* undo non-needed

* change alembic logging output

* moved db revision migration check to Tru init

---------

Co-authored-by: Aaron <[email protected]>
piotrm0 and arn-tru authored Apr 17, 2024
1 parent d03fa8b commit 1368970
Showing 43 changed files with 2,049 additions and 897 deletions.
2 changes: 1 addition & 1 deletion docs/overrides/home.html
Original file line number Diff line number Diff line change
@@ -114,7 +114,7 @@
<a href="https://go.truera.com/newsletter-archive" target="blank">
<li>Newsletter</li>
</a>
<a href="/trulens_eval/getting_started" target="_blank"
<a href="/trulens_eval/getting_started" target="docs"
class="header__btn nav__btn d-if fd-r ai-c jc-c mt-xxl-mob">
<svg class="hide-desktop" width="24" height="25" viewBox="0 0 24 25" fill="none"
xmlns="http://www.w3.org/2000/svg">
1 change: 1 addition & 0 deletions docs/trulens_eval/api/database/index.md
@@ -0,0 +1 @@
::: trulens_eval.database.base
70 changes: 70 additions & 0 deletions docs/trulens_eval/api/database/migration.md
@@ -0,0 +1,70 @@
# 🕸✨ Database Migration

When upgrading _TruLens-Eval_, you may need to migrate the database so that an
existing database created by the previously installed version incorporates the
latest changes. Changes to database schemas are handled by
[Alembic](https://github.com/sqlalchemy/alembic/), while some data changes are
handled by converters in [the data
module][trulens_eval.database.migrations.data].

## Upgrading to the latest schema revision

```python
from trulens_eval import Tru

tru = Tru(
    database_url="<sqlalchemy_url>",
    database_prefix="trulens_"  # default, may be omitted
)
tru.migrate_database()
```

## Changing database prefix

Since `0.28.0`, all tables used by _TruLens-Eval_ are prefixed with "trulens_"
including the special `alembic_version` table used for tracking schema changes.
Upgrading to `0.28.0` for the first time will require a migration as specified
above. This migration assumes that the prefix in the existing database was
blank.

If you need to change this prefix after migration, you may need to specify the
old prefix when invoking
[migrate_database][trulens_eval.tru.Tru.migrate_database]:

```python
tru = Tru(
    database_url="<sqlalchemy_url>",
    database_prefix="new_prefix"
)
tru.migrate_database(prior_prefix="old_prefix")
```
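
Before choosing a value for `prior_prefix`, it can help to check which prefix
the existing tables actually carry. The following is a minimal sketch for SQLite
databases only, using the standard-library `sqlite3` module. The helper
`tables_with_prefix` and the demo table names are hypothetical and not part of
_TruLens-Eval_.

```python
import sqlite3


def tables_with_prefix(conn: sqlite3.Connection, prefix: str) -> list[str]:
    """List tables whose names start with `prefix`, sorted by name.

    Hypothetical helper for illustration; not part of trulens_eval.
    """
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Filter in Python: `_` is a wildcard in SQL LIKE patterns.
    return [name for (name,) in rows if name.startswith(prefix)]


# Demo on an in-memory database; the table names are illustrative only.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE trulens_records (id INTEGER PRIMARY KEY)")
demo.execute("CREATE TABLE trulens_alembic_version (version_num TEXT)")
demo.execute("CREATE TABLE records (id INTEGER PRIMARY KEY)")
print(tables_with_prefix(demo, "trulens_"))
```

For a database file, pass `sqlite3.connect("<path_to_file>")` instead of the
in-memory connection.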

## Copying a database

Have a look at the help text for `copy_database` and take into account all the
items under the section `Important considerations`:

```python
from trulens_eval.database.utils import copy_database

help(copy_database)
```

Copy all data from the source database into an EMPTY target database:

```python
from trulens_eval.database.utils import copy_database

copy_database(
    src_url="<source_db_url>",
    tgt_url="<target_db_url>",
    src_prefix="<source_db_prefix>",
    tgt_prefix="<target_db_prefix>"
)
```
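
Since the target is expected to be empty, counting rows per table before the
copy (and again afterwards, against the source) is one way to sanity-check the
result. This is a minimal sketch for SQLite databases only, using the
standard-library `sqlite3` module. The helper `row_counts` and the demo table
name are hypothetical, and what exactly counts as "empty" is defined by the
`copy_database` help text, not by this sketch.

```python
import sqlite3


def row_counts(conn: sqlite3.Connection, prefix: str) -> dict[str, int]:
    """Map each table whose name starts with `prefix` to its row count.

    Hypothetical helper for illustration; not part of trulens_eval.
    """
    names = [
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        if name.startswith(prefix)
    ]
    # Names come from the database's own schema table; quote them anyway.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


# Demo on an in-memory database; the table name is illustrative only.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE tgt_records (id INTEGER PRIMARY KEY)")
print(row_counts(demo, "tgt_"))  # counts before inserting anything
demo.execute("INSERT INTO tgt_records DEFAULT VALUES")
print(row_counts(demo, "tgt_"))  # counts after one insert
```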

::: trulens_eval.tru.Tru.migrate_database

::: trulens_eval.database.utils.copy_database

::: trulens_eval.database.migrations.data
5 changes: 5 additions & 0 deletions docs/trulens_eval/api/database/sqlalchemy.md
@@ -0,0 +1,5 @@
# 🧪 SQLAlchemy Databases

::: trulens_eval.database.sqlalchemy

::: trulens_eval.database.orm
5 changes: 0 additions & 5 deletions docs/trulens_eval/api/db.md

This file was deleted.

4 changes: 4 additions & 0 deletions docs/trulens_eval/contributing/index.md
@@ -110,6 +110,10 @@ Parts of the code are nuanced in ways that should be avoided by new contributors.
Discussions of these points are welcome to help the project rid itself of these
problematic designs. See [Tech debt index](techdebt.md).

### Database Migration

[Database migration](migration.md).

## 👋👋🏻👋🏼👋🏽👋🏾👋🏿 Contributors

{%
Original file line number Diff line number Diff line change
@@ -1,37 +1,33 @@
# Database Migrations
Database schema revisions are handled with
[Alembic](https://github.com/sqlalchemy/alembic/)
# ✨ Database Migration

## Upgrading to the latest schema revision

```python
from trulens_eval import Tru

tru = Tru(database_url="<sqlalchemy_url>")
tru.migrate_database()
```
These notes only apply to _trulens_eval_ developments that change the database
schema.

Warning:
Some of these instructions may be outdated and are in the process of being updated.

## Creating a new schema revision

If upgrading the DB, you must do this step!

1. `cd truera/trulens_eval/database/migrations`
1. Make sure you have an existing database at the latest schema
* `mv trulens/trulens_eval/release_dbs/sql_alchemy_<LATEST_VERSION>/default.sqlite ./`
1. Edit the [SQLAlchemy models](../orm.py)
1. Edit the SQLAlchemy orm models in `trulens_eval/database/orm.py`.
1. Run `export SQLALCHEMY_URL="<url>" && alembic revision --autogenerate -m
"<short_description>" --rev-id "<next_integer_version>"`
1. Look at the migration script generated at [versions](./versions) and edit if
1. Look at the migration script generated at `trulens_eval/database/migrations/versions` and edit if
necessary
1. Add the version to `db_data_migration.py` in variable:
1. Add the version to `database/migrations/data.py` in variable:
`sql_alchemy_migration_versions`
1. Make any `data_migrate` updates in `db_data_migration.py` if python changes
1. Make any `data_migrate` updates in `database/migrations/data.py` if python changes
were made
1. `git add truera/trulens_eval/database/migrations/versions`
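
The `<next_integer_version>` value passed to `--rev-id` above can be derived
from the revision scripts that already exist. The following is a minimal
sketch, assuming revision files are named `<integer>_<description>.py`. That
naming, the helper `next_revision_id`, and the demo file names are assumptions
for illustration, so check the actual `versions` folder before relying on it.

```python
import re
import tempfile
from pathlib import Path


def next_revision_id(versions_dir: Path) -> int:
    """Return one more than the largest leading integer among `*.py` files.

    Assumes files are named `<integer>_<description>.py` (an assumption);
    files without a leading integer are ignored.
    """
    ids = []
    for path in versions_dir.glob("*.py"):
        match = re.match(r"(\d+)_", path.name)
        if match:
            ids.append(int(match.group(1)))
    return max(ids, default=0) + 1


# Demo against a temporary folder with made-up revision file names.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "1_first_revision.py").touch()
    (demo_dir / "2_second_revision.py").touch()
    print(next_revision_id(demo_dir))
```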

## Creating a DB at the latest schema

If upgrading the DB, you must do this step!

Note: You must create a new schema revision before doing this
@@ -58,31 +54,18 @@ Note: You must create a new schema revision before doing this
1. `git add trulens/trulens_eval/release_dbs`

## Testing the DB
Run the below:
1. `cd trulens/trulens_eval`
1. `HUGGINGFACE_API_KEY="<to_fill_out>" OPENAI_API_KEY="<to_fill_out>"
PINECONE_API_KEY="" PINECONE_ENV="" HUGGINGFACEHUB_API_TOKEN="" python -m
pytest tests/docs_notebooks -k backwards_compat`

## Copying a database
Have a look at the help text for `_copy_database` and take into account all the
items under the section `Important considerations`:

```python

from trulens_eval.database.utils import _copy_database

help(_copy_database)
```

Copy all data from the source database into an EMPTY target database:
Run the below:

```python
1. `cd trulens/trulens_eval`

from trulens_eval.database.utils import _copy_database
2. Run the tests with the requisite env vars.

_copy_database(
src_url="<source_db_url>",
tgt_url="<target_db_url>"
)
```
```bash
HUGGINGFACE_API_KEY="<to_fill_out>" \
OPENAI_API_KEY="<to_fill_out>" \
PINECONE_API_KEY="<to_fill_out>" \
PINECONE_ENV="<to_fill_out>" \
HUGGINGFACEHUB_API_TOKEN="<to_fill_out>" \
python -m pytest tests/docs_notebooks -k backwards_compat
```
7 changes: 6 additions & 1 deletion mkdocs.yml
@@ -66,6 +66,7 @@ plugins:
- https://docs.pydantic.dev/latest/objects.inv
- https://typing-extensions.readthedocs.io/en/latest/objects.inv
- https://docs.llamaindex.ai/en/stable/objects.inv
- https://docs.sqlalchemy.org/en/20/objects.inv
options:
extensions:
- pydantic: { schema: true }
@@ -257,7 +258,10 @@ nav:
- trulens_eval/api/endpoint/index.md
- OpenAI: trulens_eval/api/endpoint/openai.md
- 𝄢 Instruments: trulens_eval/api/instruments.md
- 🗄 Database: trulens_eval/api/db.md
- 🗄 Database:
- trulens_eval/api/database/index.md
- ✨ Migration: trulens_eval/api/database/migration.md
- 🧪 SQLAlchemy: trulens_eval/api/database/sqlalchemy.md
- Utils:
# - trulens_eval/api/utils/index.md
- trulens_eval/api/utils/python.md
@@ -270,6 +274,7 @@ nav:
- 🧭 Design: trulens_eval/contributing/design.md
- ✅ Standards: trulens_eval/contributing/standards.md
- 💣 Tech Debt: trulens_eval/contributing/techdebt.md
- ✨ Database Migration: trulens_eval/contributing/migration.md
- ❓ Explain:
# PLACEHOLDER: - trulens_explain/index.md
- Getting Started:
4 changes: 4 additions & 0 deletions trulens_eval/Makefile
@@ -90,6 +90,10 @@ test-database:
$(CONDA); python -m unittest discover tests.integration.test_database
docker compose --file docker/test-database.yaml down

# These tests all operate on local file databases and don't require docker.
test-database-specification:
$(CONDA); python -m unittest discover tests.integration.test_database -k TestDBSpecifications

# The next 3 database migration/versioning tests:
test-database-versioning: test-database-v2migration test-database-legacy-migration test-database-future

4 changes: 2 additions & 2 deletions trulens_eval/docker/test-database.yaml
@@ -1,6 +1,6 @@
# Docker compose environment setup for running
# integration tests of `trulens_eval.db_v2`
# Use with `make test-it-db-v2`
# integration tests for `trulens_eval.database.sqlalchemy`
# Use with `make test-database`.

version: "3.9"

2 changes: 2 additions & 0 deletions trulens_eval/examples/experimental/.gitignore
@@ -0,0 +1,2 @@
default.sqlite
paul_graham_essay.txt
Binary file removed trulens_eval/examples/experimental/default.sqlite
Binary file not shown.
91 changes: 69 additions & 22 deletions trulens_eval/examples/experimental/dev_notebook.ipynb
@@ -29,8 +29,17 @@
"while not (base / \"trulens_eval\").exists():\n",
" base = base.parent\n",
"\n",
"\n",
"import os\n",
"if os.path.exists(\"default.sqlite\"):\n",
" os.unlink(\"default.sqlite\")\n",
"\n",
"print(base)\n",
"\n",
"import shutil\n",
"shutil.copy(base / \"release_dbs\" / \"0.19.0\" / \"default.sqlite\", \"default.sqlite\")\n",
"\n",
"\n",
"# If running from github repo, can use this:\n",
"sys.path.append(str(base))\n",
"\n",
@@ -55,10 +64,57 @@
")\n",
"\n",
"from trulens_eval import Tru\n",
"tru = Tru()\n",
"tru.reset_database()\n",
"\n",
"tru.run_dashboard(_dev=base, force=True)"
"tru = Tru(database_prefix=\"dev\")\n",
"#tru.reset_database()\n",
"# tru.run_dashboard(_dev=base, force=True)\n",
"# tru.db.migrate_database()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# tru.db.migrate_database()\n",
"tru.migrate_database()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for t in tru.db.orm.registry.values():\n",
" print(t)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from trulens_eval.database.utils import copy_database"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tru.db"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"copy_database(\"sqlite:///default.sqlite\", \"sqlite:///default2.sqlite\", src_prefix=\"dev\", tgt_prefix=\"dev\")"
]
},
{
@@ -96,20 +152,13 @@
"source": [
"from trulens_eval.feedback.provider.hugs import Dummy\n",
"from trulens_eval import Select\n",
"from trulens_eval.app import App\n",
"from trulens_eval.feedback.feedback import Feedback\n",
"\n",
"f = Feedback(Dummy().language_match, if_missing=\"ignore\").on(Select.RecordCalls._retriever.retrieve.rets[42])\n",
"f = Feedback(Dummy().language_match).on_input().on(\n",
" App.select_context(query_engine))\n",
"\n",
"tru_query_engine_recorder = TruLlama(query_engine, feedbacks=[f])\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"Tru().db.get_feedback()"
"tru_query_engine_recorder = TruLlama(query_engine, feedbacks=[f])"
]
},
{
@@ -118,7 +167,10 @@
"metadata": {},
"outputs": [],
"source": [
"recs"
"llm_response, record = tru_query_engine_recorder.with_record(\n",
" query_engine.query, \"What did the author do growing up?\"\n",
")\n",
"record"
]
},
{
@@ -127,12 +179,7 @@
"metadata": {},
"outputs": [],
"source": [
"from trulens_eval.utils.asynchro import sync\n",
"\n",
"llm_response_async, record_async = sync(tru_query_engine_recorder.awith_record,\n",
" query_engine.aquery, \"What did the author do growing up?\"\n",
")\n",
"record_async"
"tru.run_dashboard(_dev=base, force=True)"
]
},
{
Expand Down
2 changes: 2 additions & 0 deletions trulens_eval/setup.py
@@ -35,7 +35,9 @@
)
)


class BuildJavascript(build):

def run(self):
"""Custom build command to run npm commands before building the package.