Based on Appendix A.9 in the paper, issue 381 (the missing zip-code validation error) appears to be graded on whether the model implements broader validation (e.g., country-specific regex dictionaries), as discussed in the issue's GitHub thread.
Does the model get to see the original GitHub discussion? If not, wouldn’t it be unfair to penalize the model for only fixing the issue as it was stated originally?
I might be misunderstanding what information the model is given in context. Is it only what's in issue_data.json, the state of the repo, and whatever it can glean from the user_tool?
This seems particularly important because AFAICT issue 381 is the IC-SWE issue with the highest payout.