Characterizing the Implicit Bias of Regularized SGD in Rank Minimization

Requirements

Python 3.10
PyTorch 1.11
NumPy
tqdm
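
Before launching jobs, it can help to confirm that the packages listed above are present in the active environment. The following is a minimal sketch using only the standard library; it assumes the packages were installed under their usual PyPI distribution names (`torch`, `numpy`, `tqdm`), and the helper `installed_versions` is a hypothetical name, not part of this repository.

```python
import sys
from importlib import metadata

# Packages from the requirements list; "torch" is PyTorch's distribution name on PyPI.
REQUIRED = ["torch", "numpy", "tqdm"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", ".".join(map(str, sys.version_info[:3])))
    for name, version in installed_versions(REQUIRED).items():
        print(name, version or "NOT INSTALLED")
```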

Running Experiments

There are three main ways to run experiments. Each job and its hyperparameters are logged in experiments.csv.

To submit the code as a Slurm job:
```
sh train.sh
```
To queue a hyperparameter sweep on Slurm:
```
python sweep.py
```
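
The contents of sweep.py are not reproduced in this README. As a rough illustration of what a sweep involves, the sketch below expands a grid of hyperparameter values into one configuration per combination. The function `expand_grid` and the hyperparameter names and values are hypothetical, chosen only for this example.

```python
from itertools import product


def expand_grid(grid):
    """Yield one dict per combination of the values in `grid`."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical hyperparameter names and values, for illustration only.
grid = {
    "lr": [0.1, 0.01],
    "weight_decay": [0.0, 5e-4],
    "batch_size": [16, 64],
}

configs = list(expand_grid(grid))
print(len(configs))  # 2 * 2 * 2 = 8 configurations
print(configs[0])    # {'lr': 0.1, 'weight_decay': 0.0, 'batch_size': 16}
```

Each resulting dict would correspond to one job in the queue; how the actual script passes these values to Slurm is not specified here.
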
To directly run the code (not recommended except for testing/debugging):
```
python train.py
```
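
Because every job is logged in experiments.csv, the log can be queried to find the runs that used a given setting. The sketch below assumes the file has a header row; the column names (`xid`, `lr`, `weight_decay`) and the helpers `load_experiments` and `select` are hypothetical, and an in-memory sample stands in for the real file.

```python
import csv
import io


def load_experiments(csv_file):
    """Read the experiment log into a list of dicts, one per logged job."""
    return list(csv.DictReader(csv_file))


def select(rows, **criteria):
    """Keep rows whose columns match every given value (compared as strings)."""
    return [
        row for row in rows
        if all(row.get(key) == str(value) for key, value in criteria.items())
    ]


# Stand-in for experiments.csv; the column names here are hypothetical.
sample = io.StringIO(
    "xid,lr,weight_decay\n"
    "101,0.1,0.0005\n"
    "102,0.01,0.0005\n"
)
rows = load_experiments(sample)
print([row["xid"] for row in select(rows, lr=0.1)])  # ['101']
```

For the real log, open the file with `open("experiments.csv", newline="")` and pass the handle to `load_experiments`, substituting the column names that actually appear in its header.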

Other Commands

To host a Jupyter notebook server:

sbatch jupyter/jupyter.sh

To resubmit a failed experiment to Slurm, resuming from the last checkpoint:

sbatch --export=ALL,xid=[XID] --job-name=[XID]_train retrain.sh
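
When several experiments have failed, the same command can be generated once per xid. The sketch below only builds and prints the command lines, following the pattern shown above; the helper `retrain_command` and the example xids are hypothetical.

```python
import shlex


def retrain_command(xid, script="retrain.sh"):
    """Build the sbatch argument list for resubmitting one experiment."""
    return [
        "sbatch",
        f"--export=ALL,xid={xid}",
        f"--job-name={xid}_train",
        script,
    ]


# Hypothetical experiment ids, for illustration only.
for xid in [101, 102]:
    print(shlex.join(retrain_command(xid)))
```

On a machine with Slurm available, each argument list could be passed to `subprocess.run(cmd, check=True)` to submit the job.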

To find and replace xids within a Jupyter notebook:

sh jupyter/replace.sh < diffout.txt > results_plot.ipynb

Other Files

jupyter/sweep_plot.ipynb: Newer Jupyter notebook that plots the results of experiments selected by their hyperparameters.
jupyter/results_plot.ipynb: Jupyter notebook used to plot the results of experiments.
conf/global_settings.py: Specifies the configuration parameters and hyperparameters.
log_settings.py: Logs the current state of the settings and saves it to CSV.
analysis_convergence.py: Functions for measuring the distance between the weights at epochs T and T+1.
analysis_rank.py: Functions for measuring the ranks of the various matrices in the network.
utils.py: Functions for saving data, loading datasets, etc.
models: Implementations of the networks used in training.
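
To make the two analysis quantities concrete, the sketch below shows one common way to compute them with NumPy: the relative Frobenius distance between consecutive weight matrices, and a numerical rank obtained by thresholding singular values. These are illustrative stand-ins, not the functions in analysis_convergence.py or analysis_rank.py; the names `relative_change` and `numerical_rank` and the tolerance `1e-3` are assumptions.

```python
import numpy as np


def relative_change(w_prev, w_next):
    """Frobenius distance between consecutive weights, relative to the norm of w_prev."""
    return np.linalg.norm(w_next - w_prev) / np.linalg.norm(w_prev)


def numerical_rank(w, tol=1e-3):
    """Count singular values larger than tol times the largest singular value."""
    s = np.linalg.svd(w, compute_uv=False)
    return int(np.sum(s > tol * s[0]))


rng = np.random.default_rng(0)
# A 50x40 matrix built as a product of thin factors, so its rank is at most 3.
w_t = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))
# Weights at the next epoch: the same matrix plus a tiny perturbation.
w_t1 = w_t + 1e-6 * rng.standard_normal(w_t.shape)

print(numerical_rank(w_t))         # 3
print(relative_change(w_t, w_t1))  # a small value, since the weights barely moved
```

A relative threshold is used so that the rank estimate does not depend on the overall scale of the matrix; a different tolerance will change the count for matrices with slowly decaying singular values.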

Reference

If you found this code useful, please cite the following paper:

@misc{galanti2023characterizingimplicitbiasregularized,
      title={SGD and Weight Decay Secretly Compress Your Neural Network}, 
      author={Tomer Galanti and Zachary S. Siegel and Aparna Gupte and Tomaso Poggio},
      year={2024},
      eprint={2206.05794},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2206.05794}, 
}
