Commit 34aa068

update results
1 parent 27b5930 commit 34aa068

1 file changed, with 5 additions and 10 deletions

README.md

Lines changed: 5 additions & 10 deletions
@@ -24,16 +24,11 @@ We propose **CTRL**, a framework that trains LLMs to critique **without human su
 
 ## 🎯 Key Results
 
-<div style="display: flex; justify-content: center; gap: 1px; margin-top: 20px;">
-<figure style="text-align: center; width: 48%;">
-<img src="https://critic-rl.github.io/static/images/scaling_v3.png" style="width: 100%;" alt="Test-time Scaling Performance">
-<figcaption>Pass@1 improves substantially after multi-turn iterations with CTRL critics.</figcaption>
-</figure>
-<figure style="text-align: center; width: 48%;">
-<img src="https://critic-rl.github.io/static/images/c2w.png" style="width: 100%;" alt="Error Compounding Analysis">
-<figcaption>CTRL maintains lower correct→incorrect rates across iterations compared to baselines.</figcaption>
-</figure>
-</div>
+<p align="center">
+<img alt="Light" src="https://critic-rl.github.io/static/images/scaling_v3.png" width="45%">
+&nbsp; &nbsp; &nbsp; &nbsp;
+<img alt="Dark" src="https://critic-rl.github.io/static/images/c2w.png" width="45%">
+</p>
 
 - **Test-time Scaling**: Qwen2.5-Coder-32B-Ins with the CTRL critic achieves 106.1% relative improvement in Pass@1 on CodeContests through multi-turn critique-revision, while maintaining low error rates across iterations
 
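As a reading aid for the "106.1% relative improvement" figure in the README bullet: relative improvement is conventionally computed as (improved - baseline) / baseline. The sketch below assumes that convention applies here. The function name and the sample Pass@1 values are hypothetical and are not taken from the CTRL results.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the relative improvement of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Hypothetical Pass@1 values chosen only for illustration (not from the source).
baseline_pass_at_1 = 0.2
improved_pass_at_1 = 0.3
print(f"{relative_improvement(baseline_pass_at_1, improved_pass_at_1):.1f}%")
```

Under this convention, a relative improvement above 100% means the metric more than doubled relative to its baseline.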