Release of the evaluation dataset for world state model #2

@hxhcreate

Hi authors, wonderful work.

I noticed that you validated the world state model's performance on three datasets:

  • AgentRewardBench: already open-sourced
  • OSWorld-full trajectories
  • Prof/Office trajectories

We are trying to make a fair comparison, so releasing the last two evaluation datasets would help us a lot.
