Conversation

Contributor

@ibolmo ibolmo commented Jul 7, 2025

No description provided.

@ibolmo ibolmo self-assigned this Jul 7, 2025

github-actions bot commented Jul 7, 2025

Braintrust eval report

Say Hi Bot Python (add-package-manager-1751934197)

Score Average Improvements Regressions
Levenshtein 77.8% (+0pp) - -
Start 1751934197.14s (+0s) - -
End 1751934197.15s (+0s) - -
Duration 0s (+0s) 1 🟢 1 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 7, 2025

Braintrust eval report

Say Hi Bot (add-package-manager-1751934202)

Score Average Improvements Regressions
Levenshtein 85.3% (+1pp) 7 🟢 6 🔴
Start 1751934202.27s - -
End 1751934203.27s - -
Duration 1s (0s) 7 🟢 2 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 7, 2025

Braintrust eval report

Console logging (add-package-manager-1751934204)

Score Average Improvements Regressions
Levenshtein 82.6% (+1pp) 5 🟢 4 🔴
Start 1751934204.32s - -
End 1751934204.33s - -
Duration 0s (+0s) 2 🟢 13 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

My Evaluation (add-package-manager-1751934204)

Score Average Improvements Regressions
Exact match 100% (+0pp) - -
Start 1751934204.37s - -
End 1751934204.38s - -
Duration 0.02s (-0.51s) 1 🟢 -
Prompt_tokens 10tok (+0tok) - -
Completion_tokens 2tok (+0tok) - -
Total_tokens 12tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

Say Hi Bot (add-package-manager-1751934204)

Score Average Improvements Regressions
Levenshtein 83.8% (-2pp) 4 🟢 7 🔴
Start 1751934204.36s - -
End 1751934205.36s - -
Duration 1s (0s) 19 🟢 -
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 7, 2025

Braintrust eval report

Say Hi Bot (add-package-manager-1751934209)

Score Average Improvements Regressions
Levenshtein 86.5% (+3pp) 9 🟢 6 🔴
Start 1751934209.51s - -
End 1751934210.51s - -
Duration 1s (+0s) - 19 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

@ibolmo ibolmo force-pushed the add-package-manager branch from 34ecc62 to bb949ca Compare July 7, 2025 18:13

github-actions bot commented Jul 7, 2025

Braintrust eval report

Say Hi Bot Python (add-package-manager-1751934196)

Score Average Improvements Regressions
Levenshtein 77.8% (+0pp) - -
Start 1751934195.87s (+0s) - -
End 1751934195.88s (+0s) - -
Duration 0s (+0s) 1 🟢 1 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

@ibolmo ibolmo force-pushed the add-package-manager branch 7 times, most recently from c77bfc9 to 472d62a Compare July 7, 2025 18:35
@ibolmo ibolmo force-pushed the add-package-manager branch from 472d62a to 5d9a09d Compare July 7, 2025 18:36
"The package manager to use for evals. Valid values: npm, pnpm, yarn, pip,
or uv depending on the runtime."
required: false
default: ""
Contributor

Should we define a default? It looks like if package_manager is "", it goes into an empty case statement.

Contributor Author

Yeah, it's a bit odd, but it's for backward compatibility. I could try without the default '', but ultimately the code will fall back to '' anyway because of the zod parsing.

Contributor

ahh gotchu! makes sense
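
For readers following the thread, here is a minimal sketch of why the explicit `default: ""` does not change behavior, assuming the parsing step turns a missing input into an empty string. `normalizePackageManager` and `VALID_PACKAGE_MANAGERS` are hypothetical names, and the function is a plain-TypeScript stand-in for the zod schema mentioned above, not the action's actual code.

```typescript
// Hypothetical stand-in for the schema-level default discussed in the thread.
// The list of valid values is taken from the input description in the diff.
const VALID_PACKAGE_MANAGERS: readonly string[] = [
  "",
  "npm",
  "pnpm",
  "yarn",
  "pip",
  "uv",
];

function normalizePackageManager(raw: string | undefined): string {
  // A missing input and an explicit "" end up as the same value.
  const value = raw ?? "";
  if (!VALID_PACKAGE_MANAGERS.includes(value)) {
    throw new Error(`Unsupported package_manager: ${value}`);
  }
  return value;
}
```

Under that assumption, omitting the input and passing `""` are indistinguishable downstream, which is why the empty case has to be handled either way.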

switch (args.runtime.toLowerCase().trim()) {
  case "node":
    switch (args.package_manager) {
      case "":
Contributor

Wondering if a default should be defined and this case removed -- what happens in this empty case?

Contributor Author

@ibolmo ibolmo Jul 7, 2025

Yeah, for backward compatibility it'll be whatever we were doing before, so for node it'll be npx and for python it'll be pip. Is that what you had in mind? Right now it's just a switch case aliased to npm.

Contributor

yeet, I forgot that a case with no return falls through to the next one -- this makes sense
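
The fall-through behavior discussed in this thread can be sketched roughly as follows. `resolveRunner` is a hypothetical function, not the action's real code. The thread only confirms `npx` for node and `pip` for python as the legacy defaults, so the strings returned for `pnpm`, `yarn`, and `uv` are placeholders.

```typescript
// Hypothetical resolver illustrating an empty `case ""` aliased to the
// legacy default through switch fall-through.
function resolveRunner(runtime: string, packageManager: string): string {
  switch (runtime.toLowerCase().trim()) {
    case "node":
      switch (packageManager) {
        case "": // empty body: falls through to the "npm" case below
        case "npm":
          return "npx"; // legacy behavior for node, per the thread
        case "pnpm":
          return "pnpm"; // placeholder value
        case "yarn":
          return "yarn"; // placeholder value
        default:
          throw new Error(`Unsupported package manager for node: ${packageManager}`);
      }
    case "python":
      switch (packageManager) {
        case "": // empty body: falls through to the "pip" case below
        case "pip":
          return "pip"; // legacy behavior for python, per the thread
        case "uv":
          return "uv"; // placeholder value
        default:
          throw new Error(`Unsupported package manager for python: ${packageManager}`);
      }
    default:
      throw new Error(`Unsupported runtime: ${runtime}`);
  }
}
```

In this sketch, `resolveRunner("node", "")` and `resolveRunner("node", "npm")` take the same branch, so callers that never set the input keep their previous behavior.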

@daviddkkim daviddkkim self-requested a review July 7, 2025 23:59
Contributor

@daviddkkim daviddkkim left a comment

Looks good -- just came across an error with yarn.
Please fix when you are ready!

@ibolmo ibolmo merged commit a0d327a into main Jul 8, 2025
7 checks passed
@ibolmo ibolmo deleted the add-package-manager branch July 8, 2025 00:57

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot Python (main-1751936261)

Score Average Improvements Regressions
Levenshtein 77.8% (+0pp) - -
Start 1751936261.27s (+0s) - -
End 1751936261.27s (+0s) - -
Duration 0s (0s) 1 🟢 -
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot Python (main-1751936260)

Score Average Improvements Regressions
Levenshtein 77.8% (+0pp) - -
Start 1751936260.48s (+0s) - -
End 1751936260.49s (+0s) - -
Duration 0s (+0s) - 2 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Console logging (main-1751936265)

Score Average Improvements Regressions
Levenshtein 82.1% (-1pp) 3 🟢 4 🔴
Start 1751936265.13s - -
End 1751936265.14s - -
Duration 0s (0s) 4 🟢 -
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

My Evaluation (main-1751936265)

Score Average Improvements Regressions
Exact match 100% (+0pp) - -
Start 1751936265.22s - -
End 1751936265.92s - -
Duration 0.7s (+0.69s) - 1 🔴
Prompt_tokens 10tok (+0tok) - -
Completion_tokens 2tok (+0tok) - -
Total_tokens 12tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

Say Hi Bot (main-1751936265)

Score Average Improvements Regressions
Levenshtein 82.1% (-4pp) 2 🟢 8 🔴
Start 1751936265.21s - -
End 1751936266.21s - -
Duration 1s (0s) 19 🟢 -
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot (main-1751936265-955e18da)

Score Average Improvements Regressions
Levenshtein 82.8% (+1pp) 8 🟢 6 🔴
Start 1751936265.37s - -
End 1751936266.37s - -
Duration 1s (+0s) - 20 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot (main-1751936273)

Score Average Improvements Regressions
Levenshtein 81.4% (-1pp) 4 🟢 7 🔴
Start 1751936273.03s - -
End 1751936274.03s - -
Duration 1s (+0s) 1 🟢 11 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot Python ([email protected])

Score Average Improvements Regressions
Levenshtein 77.8% (+0pp) - -
Start 1751936449.88s (+0s) - -
End 1751936449.89s (+0s) - -
Duration 0s (0s) 2 🟢 -
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot Python ([email protected])

Score Average Improvements Regressions
Levenshtein 77.8% (+0pp) - -
Start 1751936450.34s (+0s) - -
End 1751936450.35s (+0s) - -
Duration 0s (+0s) - 2 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot Python ([email protected])

Score Average Improvements Regressions
Levenshtein 77.8% (+0pp) - -
Start 1751936453.59s (+0s) - -
End 1751936453.6s (+0s) - -
Duration 0s (+0s) - 1 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot Python ([email protected])

Score Average Improvements Regressions
Levenshtein 77.8% (+0pp) - -
Start 1751936457.19s (+0s) - -
End 1751936457.2s (+0s) - -
Duration 0s (0s) 1 🟢 -
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Console logging (HEAD-1751936460)

Score Average Improvements Regressions
Levenshtein 83.9% (+2pp) 8 🟢 7 🔴
Start 1751936460.19s - -
End 1751936460.2s - -
Duration 0s (0s) 4 🟢 -
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

My Evaluation (HEAD-1751936460)

Score Average Improvements Regressions
Exact match 100% (+0pp) - -
Start 1751936460.2s - -
End 1751936460.59s - -
Duration 0.38s (-0.32s) 1 🟢 -
Prompt_tokens 10tok (+0tok) - -
Completion_tokens 2tok (+0tok) - -
Total_tokens 12tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

Say Hi Bot (HEAD-1751936460)

Score Average Improvements Regressions
Levenshtein 83.2% (+2pp) 6 🟢 6 🔴
Start 1751936460.25s - -
End 1751936461.25s - -
Duration 1s (0s) 20 🟢 -
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot (HEAD-1751936460-18369f47)

Score Average Improvements Regressions
Levenshtein 83.1% (0pp) 7 🟢 6 🔴
Start 1751936460.3s - -
End 1751936461.3s - -
Duration 1s (0s) 6 🟢 4 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot (HEAD-1751936463)

Score Average Improvements Regressions
Levenshtein 81.3% (-2pp) 4 🟢 8 🔴
Start 1751936463.16s - -
End 1751936464.16s - -
Duration 1s (+0s) - 20 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Console logging (HEAD-1751936469)

Score Average Improvements Regressions
Levenshtein 83.8% (0pp) 6 🟢 5 🔴
Start 1751936468.67s - -
End 1751936468.69s - -
Duration 0s (+0s) - 18 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

My Evaluation (HEAD-1751936469)

Score Average Improvements Regressions
Exact match 100% (+0pp) - -
Start 1751936468.74s - -
End 1751936469.3s - -
Duration 0.56s (+0.17s) - 1 🔴
Prompt_tokens 10tok (+0tok) - -
Completion_tokens 2tok (+0tok) - -
Total_tokens 12tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

Say Hi Bot (HEAD-1751936469)

Score Average Improvements Regressions
Levenshtein 81% (0pp) 7 🟢 7 🔴
Start 1751936468.77s - -
End 1751936469.77s - -
Duration 1s (0s) 17 🟢 -
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot (HEAD-1751936469-91b995a0)

Score Average Improvements Regressions
Levenshtein 82.9% (+2pp) 6 🟢 4 🔴
Start 1751936468.88s - -
End 1751936469.88s - -
Duration 1s (+0s) - 15 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -

github-actions bot commented Jul 8, 2025

Braintrust eval report

Say Hi Bot (HEAD-1751936470)

Score Average Improvements Regressions
Levenshtein 82.6% (0pp) 6 🟢 6 🔴
Start 1751936470.21s - -
End 1751936471.22s - -
Duration 1s (+0s) 1 🟢 14 🔴
Prompt_tokens 0tok (+0tok) - -
Completion_tokens 0tok (+0tok) - -
Total_tokens 0tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -


3 participants