Mapping between selected indices (anchor points) and original dataset #13

@Castria-cn

Hi, thank you for your great work!

I'm currently trying to reproduce your results. I noticed that the code in the repository uses a preprocessed dataset (data/lb.pickle), and that the results are presented as selected indices (tutorials/anchor_points.ipynb).

However, I'm having trouble figuring out how these indices correspond to samples in the original datasets, such as GSM8K. I’ve also checked the Hugging Face leaderboard you provided but, possibly due to version updates, I wasn’t able to find the exact models or datasets used in your code (e.g. open-llm-leaderboard/details_moreh__MoMo-72B-lora-1.8.7-DPO in tinyBenchmarks_MMLU_demo.ipynb).

Could you please clarify how to map the selected indices back to the original data samples? Are there any scripts, metadata, or versioned dataset references available for this?

Any guidance would be greatly appreciated. Thanks in advance!
