neuraloverflow/agenteval

Automated testing and benchmarking for code generation agents.

Get started

To get started:

  • Create a .env file containing your OPENAI_API_KEY.
  • Run yarn install.
  • Run node evals.js.

The eval in evals/eval-001 will be run ten times, and the results will be saved to ./output.
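
For example, the full setup can be scripted as below; the key value is a placeholder, and the exact commands are a sketch rather than part of this repository:

    # Provide the API key the evals need (placeholder value).
    echo "OPENAI_API_KEY=<your-key>" > .env

    # Install dependencies and run the eval harness.
    yarn install
    node evals.js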

Eval structure

Each eval contains the following (see the example layout after this list):

  • app: the codebase before transformation.
  • prompt.py: a description of the transformation to be made.
  • solution: the canonical solution, i.e. the complete codebase after the transformation has been applied.
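
A minimal sketch of one eval directory, assuming only the layout described above; the files shown inside app and solution are hypothetical:

    evals/eval-001/
      app/           # codebase before transformation
        index.js     # hypothetical file
      prompt.py      # description of the transformation
      solution/      # canonical transformed codebase
        index.js     # hypothetical file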

Purpose

By treating each eval as an integration test, you can automate the testing of your agent or run Monte Carlo simulations over repeated runs.
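
As a hypothetical illustration of the Monte Carlo idea (the function names below are placeholders, not part of this repository), an eval can be run many times and the pass rate reported:

    // Hypothetical sketch: run an eval N times and report the pass rate.
    // runEvalOnce stands in for whatever invokes the agent on an eval's app
    // and checks its output against the eval's solution.
    async function runEvalOnce(evalName) {
      // Placeholder: pretend the agent succeeds about 70% of the time.
      return Math.random() < 0.7;
    }

    async function monteCarlo(evalName, runs) {
      let passes = 0;
      for (let i = 0; i < runs; i++) {
        if (await runEvalOnce(evalName)) passes += 1;
      }
      const rate = ((100 * passes) / runs).toFixed(1);
      console.log(`${evalName}: ${passes}/${runs} passed (${rate}%)`);
    }

    monteCarlo('eval-001', 10);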

AgentEval

Demo

AgentEval.Demo.mp4
