Mapping between selected indices (anchor points) and original dataset #13

Hi, thank you for your great work!

I'm currently trying to reproduce your results. I noticed that the code in the repository uses a preprocessed dataset (`data/lb.pickle`), and that the results are presented as selected indices (`tutorials/anchor_points.ipynb`).

However, I'm having trouble figuring out how these indices correspond to samples in the original datasets, such as GSM8K. I’ve also checked the Hugging Face leaderboard you provided, but, possibly due to version updates, I wasn’t able to find the exact models or datasets used in your code (e.g. `open-llm-leaderboard/details_moreh__MoMo-72B-lora-1.8.7-DPO` in `tinyBenchmarks_MMLU_demo.ipynb`).

Could you please clarify how to map the selected indices back to the original data samples? Are there any scripts, metadata, or versioned dataset references available for this?
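
To make the question concrete, here is a minimal sketch of the lookup I would like to be able to do. It assumes, without confirmation, that each selected index is a zero-based row position in the original benchmark split, kept in its original order. The names `map_indices_to_samples`, `original_samples`, and `anchor_indices` are hypothetical, and the data is placeholder text, not anything taken from the repository.

```python
from typing import Dict, Sequence, TypeVar

T = TypeVar("T")


def map_indices_to_samples(indices: Sequence[int], samples: Sequence[T]) -> Dict[int, T]:
    """Return {index: sample} for every selected index.

    Assumption: each index is a zero-based position into `samples`,
    and `samples` is in the same order used when the indices were chosen.
    """
    mapping: Dict[int, T] = {}
    for i in indices:
        # Fail loudly if an index cannot refer to a row of this dataset.
        if not 0 <= i < len(samples):
            raise IndexError(f"index {i} is outside the dataset (size {len(samples)})")
        mapping[int(i)] = samples[i]
    return mapping


# Placeholder stand-ins for the original benchmark rows and the selected indices.
original_samples = ["question 0", "question 1", "question 2", "question 3"]
anchor_indices = [3, 0]

selected = map_indices_to_samples(anchor_indices, original_samples)
```

If the preprocessing reordered, filtered, or concatenated the original samples, a positional lookup like this would return the wrong rows, which is why I'm asking for the exact mapping.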

Any guidance would be greatly appreciated. Thanks in advance!
