Description
Hi, thank you for your great work!
I'm currently trying to reproduce your results. I noticed that the code in the repository uses a preprocessed dataset (data/lb.pickle) and that the results are presented as selected indices (tutorials/anchor_points.ipynb).
However, I'm having trouble figuring out how these indices correspond to samples in the original datasets, such as GSM8K. I’ve also checked the Hugging Face leaderboard you provided, but, possibly due to version updates, I wasn’t able to find the exact models or datasets used in your code (e.g., open-llm-leaderboard/details_moreh__MoMo-72B-lora-1.8.7-DPO in tinyBenchmarks_MMLU_demo.ipynb).
Could you please clarify how to map the selected indices back to the original data samples? Are there any scripts, metadata, or versioned dataset references available for this?
Any guidance would be greatly appreciated. Thanks in advance!