Research: AIGenBench Benchmark Analysis - Architecture, Dataset Overlap & Submission Strategy #67
Open
Vihaan-Singhal1 wants to merge 2 commits into development
…erlap, submission strategy

Adds docs/research/aigenbench_analysis.md covering:
- Architecture analysis of top AIGenBench models (ViT-L/14 DINOv2, CLIP, ResNet-50 CLIP) with full math: attention, DINOv2 self-distillation loss, AUROC/F1 formulas
- Critical dataset contamination finding: DRAGON consolidates Synthbuster + GenImage (both AIGenBench sources); OpenFake/WildFake overlap 7 of 9 evaluation windows
- Clean data strategy for valid leaderboard submission
- Testing and submission protocol (PyTorch Lightning framework, DetermAugment pipeline, sliding window evaluation)
- Roadmap: upgrade to ViT-L/14 DINOv2, add JPEG compression augmentation, ensemble with FFT

Closes #65

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
24-page styled PDF generated from aigenbench_analysis.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Closes #65
This PR delivers docs/research/aigenbench_analysis.md, a research-paper-quality analysis of the AI-GenBench benchmark covering all three requirements from issue #65.

Contents
1. Architecture Analysis of Top-Performing Models
2. Dataset Overlap Analysis (Critical Finding)
3. Testing Strategy and Submission Process
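
For reviewers who want a quick reference for the AUROC and F1 metrics named in the commit message, here is a minimal, dependency-free sketch. It is not taken from aigenbench_analysis.md or from the benchmark's own evaluation code. The function names, the label convention (1 = AI-generated, 0 = real), and the 0.5 decision threshold are all assumptions made for illustration.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUROC: chance that a random positive outscores a random negative.

    Ties count as half a win. Assumed convention: label 1 = AI-generated.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def f1_score(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """F1 = 2PR / (P + R) after thresholding scores (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data, purely illustrative (not benchmark results).
    y = [1, 1, 0, 0]
    s = [0.9, 0.4, 0.6, 0.1]
    print(auroc(y, s), f1_score(y, s))
```

The pairwise form is quadratic in the number of samples, so it only suits small sanity checks. A sorted, rank-based computation would be the usual choice at benchmark scale.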
Document Highlights
Actions Recommended
🤖 Generated with Claude Code