Conversation
|
|
||
| // indicate the project name the tests will be sent to | ||
| const bt = wrapVitest( | ||
| { test, expect, describe, afterAll }, |
There was a problem hiding this comment.
I'd recommend for users to:
import * as vitest from 'vitest';
const { describe, expect, test, ... } = wrapVitest(vitest);
simpler, less prone to error, and future proof if we add more functions to our support
| metadata: { category: "math" }, | ||
| tags: ["arithmetic"], | ||
| }, | ||
| async ({ input, expected }) => { |
There was a problem hiding this comment.
til.. so the context is re-inserted as an argument
| @@ -0,0 +1,46 @@ | |||
| import tsconfigPaths from "vite-tsconfig-paths"; | |||
There was a problem hiding this comment.
didn't expect to see this file in here 🤔
js/src/wrappers/vitest/wrapper.ts
Outdated
| return originalDescribe(suiteName, () => { | ||
| // Lazily initialize experiment context on first access | ||
| let context: ExperimentContext | null = null; | ||
| const getOrCreateContext = (): ExperimentContext => { |
There was a problem hiding this comment.
like we want to extract and use the same getOrCreate inside of test() calls
| "Braintrust: vitestMethods.describe is required. Please pass in the describe function from vitest.", | ||
| ); | ||
| } | ||
| if (!vitestMethods.expect) { |
There was a problem hiding this comment.
i didn't see a wrapExpect in wrapper.ts. I wonder if each expect is a scorer?
js/src/wrappers/vitest/wrapper.ts
Outdated
| }); | ||
|
|
||
| // If test function returns a value, log it as output | ||
| if (testResult !== undefined) { |
There was a problem hiding this comment.
i think if you traced(maybeFn || configOrFn, ...) you may have gotten this automatically?
js/src/wrappers/vitest/wrapper.ts
Outdated
| scores: { | ||
| pass: 0, | ||
| }, | ||
| metadata: { |
There was a problem hiding this comment.
you should probably just throw again. the traced() call should handle the error.
js/src/wrappers/vitest/wrapper.ts
Outdated
| datasetExamples: Map<string, string>; // test name -> example id | ||
| } | ||
|
|
||
| // Global context holder (one per describe block) |
There was a problem hiding this comment.
what happens with concurrent calls i.e.
it.concurrent(
describe(..., () => {
})
);
it.concurrent(
describe(..., () => {
})
);
did you give currentExperiment a try?
There was a problem hiding this comment.
I modified how the context is created for experiments. I added some additional tests for the concurrent experiments.
remove dataset creation Add tests for concurrent tests
3300ebb to
b2d3406
Compare
ibolmo
left a comment
There was a problem hiding this comment.
small suggestions not blocking
| async ({ input, expected }) => { | ||
| const result = 4; | ||
| logOutputs({ answer: result }); | ||
| expect(result).toBe(expected); |
There was a problem hiding this comment.
ah yes, could we try multiple expect() calls? I wonder how that would look in the bt logs.
what does the custom message look like in braintrust? i.e.
expect(result, 'equality').toBe(expected)
js/src/wrappers/vitest/index.ts
Outdated
| return { | ||
| test: wrappedTest, | ||
| it: wrappedTest, | ||
| expect: vitestMethods.expect, |
There was a problem hiding this comment.
ah so we don't wrap expects.
like i mentioned earlier, it would be interesting to use the expect(..., 'message')... and the message is the index for the output, or maybe we have a counter/stack that we push expect(output) to then the output might be something like
expect(0).toBe(0);
expect('something', 'message').toBe('something');
expect('foo').toBe('bar');then the event could be
{
...
output: {
0: output,
'message': output,
1: output,
},
scores: {
0: 1,
'message': 1,
1: 0
}
}
}There was a problem hiding this comment.
ah yes I didn't wrap the expect method just the test methods. I was thinking people cared about getting the scoring which does add its output to individual logs. Thinking about this flow a bit.
There was a problem hiding this comment.
did a compromise and logged named outputs automatically. Users can still log additional outputs as necessary. Will see what the feedback is like.
| pnpm dlx tsx openai.ts | ||
| ``` | ||
|
|
||
| ### Vitest Golden Tests |
| * expected: 'hola', | ||
| * metadata: { language: 'spanish' }, | ||
| * }, | ||
| * async ({ input, expected }) => { |
There was a problem hiding this comment.
i wonder if we need the trace() method 🤔
js/src/wrappers/vitest/index.ts
Outdated
| * | ||
| * bt.describe('Translation Tests', () => { | ||
| * bt.afterAll(async () => { | ||
| * await bt.flushExperiment(); // Flushes and displays experiment summary |
There was a problem hiding this comment.
i'd expect this to be done automatically 🤔. looking at their docs it doesn't seem like they require them to be explicit
There was a problem hiding this comment.
This is done automatically. The method is exposed for access but doesn't need to be called. I thought I removed it from the examples and tests, might have missed this one.
69f9e7a to
bfe8287
Compare
Add ability to create experiments writing tests with vitest
Use existing braintrust datasets or pass their own data to test
Use existing scorers or pass their own custom scorer to tests
Each describe will create an experiment and the tests inside will each be their own span in the experiment.
See golden-ts-vitest-experiment-v* projects for some examples
Example: