Commit 7f4f38c

Merge pull request #180 from codelion/release-longcepo
bump version for new release
2 parents a1782bd + 70c0545 commit 7f4f38c

3 files changed, 27 additions and 27 deletions

README.md

Lines changed: 25 additions & 25 deletions
@@ -467,31 +467,6 @@ Authorization: Bearer your_secret_api_key
 
 ## SOTA results on benchmarks with optillm
 
-### CePO on math and code benchmarks (Mar 2025)
-
-| Method | Math-L5 | MMLU-Pro (Math) | CRUX | LiveCodeBench (pass@1) | Simple QA |
-| -----------------------------: | :-----: | :-------------: | :----: | :--------------------: | :-------: |
-| Llama 3.3 70B | 51.0 | 78.6 | 72.6 | 27.1 | 20.9 |
-| Llama 3.1 405B | 49.8 | 79.2 | 73.0 | 31.8 | 13.5 |
-| CePO (using Llama 3.3 70B) | 69.6 | 84.8 | 80.1 | 31.9 | **22.6** |
-| QwQ 32B | 61.4 | 90.8 | 82.5 | 44.3 | 7.8 |
-| CePO (using QwQ 32B) | 88.1 | **92.0** | 86.3 | **51.5** | 8.2 |
-| DeepSeek R1 Llama | 83.1 | 82.0 | 84.0 | 47.3 | 14.6 |
-| CePO (using DeepSeek R1 Llama) |**90.2** | 84.0 |**89.4**| 47.2 | 15.5 |
-
-### coc-claude-3-5-sonnet-20241022 on AIME 2024 pass@1 (Nov 2024)
-
-| Model | Score |
-|-------|-----:|
-| o1-mini | 56.67 |
-| coc-claude-3-5-sonnet-20241022 | 46.67 |
-| coc-gemini/gemini-exp-1121 | 46.67 |
-| o1-preview | 40.00 |
-| gemini-exp-1114 | 36.67 |
-| claude-3-5-sonnet-20241022 | 20.00 |
-| gemini-1.5-pro-002 | 20.00 |
-| gemini-1.5-flash-002 | 16.67 |
-
 ### LongCePO on LongBench v2 (Apr 2025)
 
 | Model¹ | Context window | Short samples (up to 32K words) | Medium samples (32–128K words) |
@@ -518,6 +493,31 @@ Authorization: Bearer your_secret_api_key
 
 ¹ Numbers in parentheses for LongCePO indicate accuracy of majority voting from 5 runs.
 
+### CePO on math and code benchmarks (Mar 2025)
+
+| Method | Math-L5 | MMLU-Pro (Math) | CRUX | LiveCodeBench (pass@1) | Simple QA |
+| -----------------------------: | :-----: | :-------------: | :----: | :--------------------: | :-------: |
+| Llama 3.3 70B | 51.0 | 78.6 | 72.6 | 27.1 | 20.9 |
+| Llama 3.1 405B | 49.8 | 79.2 | 73.0 | 31.8 | 13.5 |
+| CePO (using Llama 3.3 70B) | 69.6 | 84.8 | 80.1 | 31.9 | **22.6** |
+| QwQ 32B | 61.4 | 90.8 | 82.5 | 44.3 | 7.8 |
+| CePO (using QwQ 32B) | 88.1 | **92.0** | 86.3 | **51.5** | 8.2 |
+| DeepSeek R1 Llama | 83.1 | 82.0 | 84.0 | 47.3 | 14.6 |
+| CePO (using DeepSeek R1 Llama) |**90.2** | 84.0 |**89.4**| 47.2 | 15.5 |
+
+### coc-claude-3-5-sonnet-20241022 on AIME 2024 pass@1 (Nov 2024)
+
+| Model | Score |
+|-------|-----:|
+| o1-mini | 56.67 |
+| coc-claude-3-5-sonnet-20241022 | 46.67 |
+| coc-gemini/gemini-exp-1121 | 46.67 |
+| o1-preview | 40.00 |
+| gemini-exp-1114 | 36.67 |
+| claude-3-5-sonnet-20241022 | 20.00 |
+| gemini-1.5-pro-002 | 20.00 |
+| gemini-1.5-flash-002 | 16.67 |
+
 ### readurls&memory-gpt-4o-mini on Google FRAMES Benchmark (Oct 2024)
 | Model | Accuracy |
 | ----- | -------- |
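
The LongCePO footnote in the README diff reports accuracy from majority voting over 5 runs. The README does not show how that vote is computed, so the following is only a minimal sketch of one plausible reading: for each question, take the most frequent answer across runs and score it against the reference. The function names, the sample data, and the tie-breaking rule (first answer seen wins) are all assumptions made for illustration, not optillm code.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer; ties go to the answer seen first (assumed rule)."""
    counts = Counter(answers)
    best = max(counts.values())
    for answer in answers:
        if counts[answer] == best:
            return answer


def majority_vote_accuracy(runs, gold):
    """Score the per-question majority answer against the reference answers.

    runs: list of runs, each a list with one answer per question.
    gold: list of reference answers, one per question.
    """
    voted = [majority_vote(list(per_question)) for per_question in zip(*runs)]
    correct = sum(v == g for v, g in zip(voted, gold))
    return correct / len(gold)


# Hypothetical data: 5 runs over 3 multiple-choice questions.
runs = [
    ["A", "B", "C"],
    ["A", "C", "C"],
    ["B", "B", "D"],
    ["A", "B", "C"],
    ["A", "D", "A"],
]
gold = ["A", "B", "D"]
accuracy = majority_vote_accuracy(runs, gold)
```

In this made-up sample the vote recovers the reference answer on the first two questions and misses the third, so two of three questions are scored correct.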

optillm/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 import os
 
 # Version information
-__version__ = "0.1.10"
+__version__ = "0.1.11"
 
 # Get the path to the root optillm.py
 spec = util.spec_from_file_location(

setup.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 
 setup(
     name="optillm",
-    version="0.1.10",
+    version="0.1.11",
     packages=find_packages(include=['optillm', 'optillm.*']),  # This ensures all subpackages are included
     py_modules=['optillm'],
     package_data={
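
This commit bumps the version string in two places, `optillm/__init__.py` and `setup.py`, and the two must stay in sync by hand. As a hedged illustration only, a small check along the following lines could catch a mismatch before release. The helper names, the regular expressions, and the inline sample text are assumptions for this sketch and are not part of the repository.

```python
import re

# Assumed patterns for the two version declarations touched by this commit.
VERSION_PATTERNS = {
    "optillm/__init__.py": r'__version__\s*=\s*"([^"]+)"',
    "setup.py": r'\bversion\s*=\s*"([^"]+)"',
}


def extract_version(text, pattern):
    """Return the first version string matched by pattern, or raise ValueError."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError("no version string found")
    return match.group(1)


def versions_agree(sources):
    """sources maps a file name to its text; returns (all_equal, versions_by_file)."""
    found = {
        name: extract_version(text, VERSION_PATTERNS[name])
        for name, text in sources.items()
    }
    return len(set(found.values())) == 1, found


# Inline stand-ins for the two files after this commit (not read from disk).
sample = {
    "optillm/__init__.py": '# Version information\n__version__ = "0.1.11"\n',
    "setup.py": 'setup(\n    name="optillm",\n    version="0.1.11",\n',
}
ok, found = versions_agree(sample)
```

In a real checkout the dictionary values would come from reading the two files; the sketch keeps the text inline so it runs on its own.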
