R²MAE: Understanding and Enhancing Mask-Based Pretraining

This repository contains the code for the NeurIPS 2025 paper Understanding and Enhancing Mask-Based Pretraining towards Universal Representations.

The paper presents a working theory of mask-based pretraining schemes (e.g., MIM and MAE) using high-dimensional linear regression, and proposes an embarrassingly simple improvement guided by the theory. The theoretical framework and its implications are validated across diverse neural architectures (MLPs, CNNs, and Transformers) on both vision and language tasks. The proposed improvement, termed R²MAE, is implemented in vision, language, DNA sequence, and single-cell models, where it consistently outperforms standard and more complicated masking schemes.
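For readers unfamiliar with the setup the theory analyzes, below is a minimal sketch of a generic mask-based pretraining step (the standard MAE/MIM-style objective, not the R²MAE modification itself). The helper `masked_pretrain_step` and the zero-masking choice are illustrative assumptions, not code from this repository:

```python
import torch

def masked_pretrain_step(model, x, mask_ratio=0.75):
    """One generic mask-based pretraining step (MAE/MIM-style):
    hide a random subset of positions and train the model to
    reconstruct them from the visible rest.

    x: (batch, tokens, dim) input tensor.
    """
    B, T, _ = x.shape
    # Sample a random binary mask per sequence: 1 = masked (hidden).
    mask = (torch.rand(B, T) < mask_ratio).float()
    # Zero out masked positions (a stand-in for a learned [MASK] token).
    x_visible = x * (1.0 - mask).unsqueeze(-1)
    # The model predicts the full input from the corrupted view.
    recon = model(x_visible)
    # Reconstruction loss is computed on masked positions only.
    per_token_err = ((recon - x) ** 2).mean(dim=-1)   # (B, T)
    loss = (per_token_err * mask).sum() / mask.sum().clamp(min=1.0)
    return loss
```

In practice MAE-style models encode only the visible tokens for efficiency; the sketch corrupts in place for brevity.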

@article{dong2025understanding,
  title={Understanding and Enhancing Mask-Based Pretraining towards Universal Representations},
  author={Dong, Mingze and Wang, Leda and Kluger, Yuval},
  journal={The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)},
  year={2025}
}

Overview

Table of Contents:

  • notebooks contains Jupyter notebooks for reproducing plots and results in the paper.
  • src contains R²MAE implementations for vision MAE, RoBERTa, GPN-MSA, and single-cell models. Note: most modifications are in-place patches, so the corresponding original repositories may also need to be downloaded.
    • mae is a modification of facebookresearch/mae that implements R²MAE.
      • engine_pretrain_customized.py is the modified version of the original engine_pretrain.py that implements R²MAE.
      • main_pretrain_customized.py is the modified version of the original main_pretrain.py.
      • Several files were slightly modified for dependency issues.
    • roberta is a modification of the RoBERTa model implementation in huggingface/transformers/examples/pytorch/language-modeling (v4.52.4) that implements R²MAE.
      • run_mlm_customized.py is the modified version of the original run_mlm.py.
      • datacollator_customized.py should be placed in the same transformers/examples/pytorch/language-modeling folder to override the default data collator.
    • gpn-msa is a modification of gpn-msa in songlab-cal/gpn/gpn (v0.6) that implements R²MAE and alternative baselines. Most of the modifications enable CL-MAE (named "map" in the repository); an R²MAE-only version would be much simpler.
      • model_map.py is the modified version of the original model.py.
      • msa_map contains the modified version of train.py.
    • single_cell is an R²MAE implementation for single-cell MAE models, built via scvi-tools (v0.16.0).
      • The scVIMaskModel class supports R²MAE, Dynamic MR, etc.
  • scripts contains model training scripts for vision MAE, RoBERTa, GPN-MSA and single-cell models.
