R²MAE: Understanding and Enhancing Mask-Based Pretraining

This repository contains the code for the NeurIPS 2025 paper Understanding and Enhancing Mask-Based Pretraining towards Universal Representations.

The paper presents a working theory of mask-based pretraining schemes (e.g., MIM and MAE) using high-dimensional linear regression, and proposes an embarrassingly simple improvement guided by the theory. The theoretical framework and its implications have been validated across diverse neural architectures (including MLPs, CNNs, and Transformers) applied to both vision and language tasks. The proposed improvement, termed R²MAE, is implemented in vision, language, DNA sequence, and single-cell models, where it consistently outperforms standard and more complicated masking schemes.
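The paragraph above does not spell out what the improvement is. Purely as an illustration, and assuming that R²MAE amounts to drawing the masking ratio at random for every training step instead of fixing it, a minimal sketch could look like the following. The function name `sample_mask`, the uniform distribution, and the default `ratio_range` are hypothetical choices made for this example, not values taken from the paper or from `src`; consult those for the actual rule.

```python
import random


def sample_mask(num_tokens, ratio_range=(0.15, 0.85), rng=None):
    """Return (ratio, mask); mask[i] is True when position i is masked.

    ASSUMPTION: the masking ratio is drawn uniformly from ratio_range on
    every call. The default range is a placeholder, not a paper setting.
    """
    rng = rng or random.Random()
    low, high = ratio_range
    if not 0.0 <= low <= high <= 1.0:
        raise ValueError("ratio_range must satisfy 0 <= low <= high <= 1")

    # Draw a fresh masking ratio for this step rather than using a constant.
    ratio = rng.uniform(low, high)

    # Mask that fraction of positions, chosen uniformly without replacement.
    num_masked = round(ratio * num_tokens)
    masked_positions = set(rng.sample(range(num_tokens), num_masked))
    mask = [i in masked_positions for i in range(num_tokens)]
    return ratio, mask


if __name__ == "__main__":
    # 196 is the patch count of a 224x224 image with 16x16 patches.
    ratio, mask = sample_mask(num_tokens=196, rng=random.Random(0))
    print(f"ratio={ratio:.3f}, masked={sum(mask)} of {len(mask)}")
```

In a real training loop the mask would be built with tensor operations per batch; the pure-Python version only shows where the randomness enters.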

@article{dong2025understanding,
  title={Understanding and Enhancing Mask-Based Pretraining towards Universal Representations},
  author={Dong, Mingze and Wang, Leda and Kluger, Yuval},
  journal={The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)},
  year={2025}
}

Overview

Table of Contents:

notebooks contains Jupyter notebooks for reproducing plots and results in the paper.
src contains R²MAE implementations for vision MAE, RoBERTa, GPN-MSA, and single-cell models. Note: most modifications were made in place, so the original repositories may also need to be downloaded.
- mae is a modification of facebookresearch/mae that implements R²MAE.
  - engine_pretrain_customized.py is the modified version of the original engine_pretrain.py that implements R²MAE.
  - main_pretrain_customized.py is the modified version of the original main_pretrain.py.
  - Several files were slightly modified for dependency issues.
- roberta is a modification of the RoBERTa model implementation in huggingface/transformers/examples/pytorch/language-modeling (v4.52.4) that implements R²MAE.
  - run_mlm_customized.py is the modified version of the original run_mlm.py.
  - datacollator_customized.py should be placed in the same folder, transformers/examples/pytorch/language-modeling, to enable the override.
- gpn-msa is a modification of gpn-msa in songlab-cal/gpn/gpn (v0.6) that implements R²MAE and alternative baselines. Most of the modifications enable CL-MAE (named "map" in the repository); an R²MAE-only version would be much simpler.
  - model_map.py is the modified version of the original model.py.
  - msa_map contains the modified version of train.py.
- single_cell is an R²MAE implementation for single-cell MAE models built with scvi-tools (v0.16.0).
  - The scVIMaskModel class supports R²MAE, Dynamic MR, etc.
scripts contains model training scripts for vision MAE, RoBERTa, GPN-MSA and single-cell models.
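
The RoBERTa file placement described in the list above can be scripted. The helper below is a sketch under stated assumptions: the function name `install_roberta_override` is hypothetical, the source location `src/roberta/` is inferred from the layout described above, and copying run_mlm_customized.py alongside the data collator is an assumption based on the "same folder" wording.

```python
import shutil
from pathlib import Path

# Files named in the README; copying both is an assumption (see lead-in).
ROBERTA_FILES = ("run_mlm_customized.py", "datacollator_customized.py")


def install_roberta_override(r2mae_root, transformers_root):
    """Copy the customized RoBERTa files into a transformers checkout.

    r2mae_root: path to a clone of this repository (assumed layout src/roberta/).
    transformers_root: path to a clone of huggingface/transformers.
    Returns the list of destination paths that were written.
    """
    src_dir = Path(r2mae_root) / "src" / "roberta"
    dst_dir = (
        Path(transformers_root) / "examples" / "pytorch" / "language-modeling"
    )
    if not dst_dir.is_dir():
        raise FileNotFoundError(f"expected an existing folder at {dst_dir}")

    copied = []
    for name in ROBERTA_FILES:
        # copy2 preserves file metadata and returns the destination path.
        copied.append(Path(shutil.copy2(src_dir / name, dst_dir / name)))
    return copied
```

A call such as `install_roberta_override("path/to/this/repo", "path/to/transformers")` would then place both files next to the original run_mlm.py; both paths are placeholders.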
