Conversation

@RitzChow

I've added the Seephys benchmark.

#payload["reasoning_effort"] = "medium"
payload["response_format"] = {"type": "text"}
payload["max_completion_tokens"] = gen_kwargs["max_new_tokens"]
payload["max_completion_tokens"] = 5000
Collaborator

This is a bit hardcoded: the 5000 immediately overrides the max_new_tokens value taken from gen_kwargs on the line above.
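A minimal sketch of the alternative, assuming a small helper (the function name and the 4096 fallback are hypothetical, not the repo's code):

def set_completion_limit(payload: dict, gen_kwargs: dict, default: int = 4096) -> dict:
    # Sketch only: respect the caller's max_new_tokens and fall back to a
    # default, instead of overwriting it with a hardcoded 5000.
    payload["max_completion_tokens"] = gen_kwargs.get("max_new_tokens", default)
    return payload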

del payload["temperature"]
payload.pop("max_tokens")
payload["reasoning_effort"] = "medium"
#payload["reasoning_effort"] = "medium"
Collaborator

I think we should probably add a control arg for reasoning effort in openai compatible instead of commenting it out directly.
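Something like the sketch below could work (the class name, the reasoning_effort arg name, and the defaults are assumptions for illustration, not the actual openai compatible implementation):

from typing import Optional


class OpenAICompatibleSketch:
    # Sketch only: expose reasoning effort as an optional model arg instead of
    # hardcoding or commenting it out in the payload-building code.
    def __init__(self, reasoning_effort: Optional[str] = None, **kwargs):
        # e.g. "low" / "medium" / "high"; None means the field is never sent
        self.reasoning_effort = reasoning_effort

    def build_payload(self, messages: list, gen_kwargs: dict) -> dict:
        payload = {
            "messages": messages,
            "max_completion_tokens": gen_kwargs.get("max_new_tokens", 4096),
        }
        if self.reasoning_effort is not None:
            payload["reasoning_effort"] = self.reasoning_effort
        return payload

That way the value could be passed through model args (e.g. reasoning_effort=medium) rather than edited in the source.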

Author

I apologize; this part is a temporary modification I made for testing purposes. You can ignore it and focus only on the newly added task.

Collaborator

@kcz358 left a comment

Hi, most of it LGTM. Just a few comments on the revision in openai compatible. Thanks for the contribution!
