Releases: axon-rl/gem
v0.1.0
What's Changed
- feat: support step for selective subenvs by @vermouthdky in #75
- (fix): fix reset data idx. by @PanAndy in #76
- feat: integrate OpenRLHF training framework by @lkevinzc in #80
- refactor: change verifier to oat.math_grader && allow idx specification by @vermouthdky in #84
- Integrate RL2 with Gem environments by @simonucl in #83
- feat: add 4 more games and random version of each game by @vermouthdky in #86
- feat: align with oat's multi-turn ppo apis by @lkevinzc in #85
- feat: multi-agent env api design by @Benjamin-eecs in #87
- Grpo4 by @anyasims in #89
- feat: support MCP tool and MCPMark env by @cameron-chen in #82
- feat: add visual math environment by @lkevinzc in #93
- feat: add non-multi-processing math grading by @lkevinzc in #95
- feat: integrate tinker-cookbook & add spawn() for lightweight parallel environments by @cameron-chen in #96
- feat: add single-file tinker example by @lkevinzc in #97
- chore: update readme by @lkevinzc in #99
- doc: add bibtex by @lkevinzc in #100
- chore: bump version by @lkevinzc in #101
- chore: fix typo by @lkevinzc in #102
- docs: refactor the examples folder and add readme for tinker by @cameron-chen in #98
New Contributors
- @PanAndy made their first contribution in #76
- @simonucl made their first contribution in #83
- @Benjamin-eecs made their first contribution in #87
Full Changelog: v0.0.4...v0.1.0
v0.0.4
What's Changed
- feat: init the first env by @lkevinzc in #2
- feat: add multi-turn api and stateful obs wrapper by @lkevinzc in #4
- feat: support reasoning_gym as single-turn environments by @lkevinzc in #6
- feat: python tool use by @anyasims in #8
- feat: textarena mastermind and minesweeper by @anyasims in #9
- feat: support vector env by @lkevinzc in #10
- fix: vec env reset & guess the number reward by @lkevinzc in #12
- feat: textarena wordle by @anyasims in #14
- feat: async vector env by @anyasims in #11
- fix: fix guess the number reward spike by @lkevinzc in #15
- fix mastermind by @vermouthdky in #18
- Remove distinction between single-step and multi-step envs and add math_env by @anyasims in #17
- Wrappers etc by @anyasims in #23
- feat: support code environment with evaluation test by @lkevinzc in #25
- chore: add math12k dataset by @lkevinzc in #26
- fix: fix math verify error by @lkevinzc in #27
- feat: textarena refinements by @vermouthdky in #24
- feat: add code train data; refine async vec env by @lkevinzc in #28
- feat: add ta:FifteenPuzzle by @vermouthdky in #30
- feat: add QA env and support search tool by @cameron-chen in #31
- feat: add logic reasoning (ruletaker) environment by @lkevinzc in #32
- Search tool evaluation, single turn chat w/ tools and some QaEnv fixes by @cameron-chen in #34
- feat: add two qa environments (nq & hotpotqa); add search tool to wrapper factory by @cameron-chen in #35
- ref: improve search engine's perf; enhance qa env & search tool prompt by @cameron-chen in #37
- chore: clean py tool, use tool chat template by @lkevinzc in #41
- chore: clean search tool, sync the reward setting by @cameron-chen in #47
- feat: add oat examples by @lkevinzc in #49
- feat: add more textarena games and refinement by @vermouthdky in #48
- chore: fix imports, add math lvl3-4, better naming for game env by @lkevinzc in #50
- feat: integration with verl framework by @lkevinzc in #51
- feat: standardize python tool offline evaluation by @lkevinzc in #53
- doc: add the example of training on QA env with/without tool by @cameron-chen in #52
- feat: benchmark qa with search tool by @cameron-chen in #54
- chore: add license, prepare examples and package by @lkevinzc in #56
- refine online search server (ggl & serp) by @cameron-chen in #57
- doc: update readme by @cameron-chen in #55
- chore: update link by @lkevinzc in #58
- fix: readme typo by @lkevinzc in #59
- chore: refine example readme by @lkevinzc in #61
- chore: refine example readme by @lkevinzc in #62
- chore: refine example readme by @lkevinzc in #64
- chore: update readme and license by @lkevinzc in #66
- chore: add env_id property to vector_env by @vermouthdky in #67
- fix: randomness when seed is provided for .reset() by @lkevinzc in #68
- chore: fix random on reset and packaging by @lkevinzc in #69
- feat: support terminal environment via docker container by @lkevinzc in #72
New Contributors
- @lkevinzc made their first contribution in #2
- @anyasims made their first contribution in #8
- @vermouthdky made their first contribution in #18
- @cameron-chen made their first contribution in #31
Full Changelog: https://github.com/axon-rl/gem/commits/v0.0.4