
Code for paper: "Variance Reduced Local SGD with Lower Communication Complexity"

Dependencies and Setup

All code runs on Python 3.6.7 using PyTorch version 1.1.0.

In addition, you will need to install the following packages (an install sketch follows this list):

  • torchvision
  • torchtext
  • numpy
  • pandas
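
A minimal install sketch, assuming a pip-based environment; the torchvision pin paired with PyTorch 1.1.0 and the unpinned torchtext are assumptions, so adjust versions for your platform:

# install the pinned PyTorch release and the remaining dependencies
pip install torch==1.1.0 torchvision==0.3.0
pip install torchtext numpy pandas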

Preprocess Data

DBPedia

  • Download the data from the link and extract it into the current directory. This yields two files: train.csv and test.csv.
  • Modify the data path in process_text.py and run it (see the sketch below).
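
A minimal sketch, assuming the archive is downloaded as dbpedia_csv.tar.gz (the actual file name may differ):

# extract the archive to obtain train.csv and test.csv, then run the preprocessing script
tar -xzf dbpedia_csv.tar.gz
python process_text.py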

Tiny ImageNet

  • Download the data from the link and extract it into the current directory.
  • Modify the data path in process_tiny_magenet.py and run it (see the sketch below).
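
A minimal sketch, assuming the archive is downloaded as tiny-imagenet-200.zip (the actual file name may differ):

# extract the archive, then run the preprocessing script with its data path pointed at the extracted folder
unzip tiny-imagenet-200.zip
python process_tiny_magenet.py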

Running Experiments

There are two main scripts:

  • train.sh for training using S-SGD, Local SGD and VRL-SGD.
  • plot_all.sh for plotting the figures (usage sketch below).
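
A minimal usage sketch, assuming both scripts take no required arguments; check the scripts for any paths or settings to adjust:

bash train.sh       # runs the S-SGD / Local SGD / VRL-SGD training commands
bash plot_all.sh    # produces the comparison figures from the saved results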

Description of main parameters

  • --lr learning rate.
  • --model model name; options: lenet5, text_cnn, mlp.
  • --dataset dataset name; options: mnist, DB_Pedia, tiny_imagenet.
  • --epochs the number of training epochs.
  • --gpu-num the number of GPUs.
  • --batch-size batch size for each machine.
  • -r resume training from the initialization checkpoint (see Warm Up below).
  • --local whether to communicate periodically.
  • --period the communication period. If --local is not set, it is always 1.
  • --cluster-data each worker only accesses a subset of the data (non-identical case).
  • --vrl whether to run the VRL-SGD algorithm.

Warm Up

We recommend performing 2 epochs of SGD to initialize the weights; otherwise the -r parameter cannot be used. After initialization is complete, rename the checkpoint file, e.g. change lenet5.pth to lenet5_init.pth.
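
A minimal sketch of the warm-up for LeNet on MNIST; the flags mirror the S-SGD command below with -r dropped and the epoch count reduced to 2, and the checkpoint handling is an assumption:

# 2-epoch warm-up run, then rename the saved weights so later runs can resume with -r
python main.py --lr 0.005 --model lenet5 --dataset mnist --epochs 2 --st 0 -s 1 --gpu-num 8 --port 6632
mv lenet5.pth lenet5_init.pth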

LeNet on MNIST

Non-Identical Case

# S-SGD
python main.py --lr 0.005 --model lenet5 --dataset mnist --epochs 100  --st 0 -s 1 --gpu-num 8 -r --port 6632 --cluster-data
# Local-SGD
python main.py --lr 0.005 --model lenet5 --dataset mnist --epochs 100  --st 0 -s 1 --gpu-num 8 -r --port 6633 --cluster-data --local --period 20
# VRL-SGD
python main.py --lr 0.005 --model lenet5 --dataset mnist --epochs 100  --st 0 -s 1 --gpu-num 8 -r --port 6634 --cluster-data --local --period 20 --vrl

Identical Case

# S-SGD
python main.py --lr 0.005 --model lenet5 --dataset mnist --epochs 100  --st 0 -s 1 --gpu-num 8 -r --port 6632
# Local-SGD
python main.py --lr 0.005 --model lenet5 --dataset mnist --epochs 100  --st 0 -s 1 --gpu-num 8 -r --port 6633 --local --period 20
# VRL-SGD
python main.py --lr 0.005 --model lenet5 --dataset mnist --epochs 100  --st 0 -s 1 --gpu-num 8 -r --port 6634 --local --period 20 --vrl

TextCNN on DBPedia

Non-Identical Case

# S-SGD
python main.py --lr 0.01 --model text_cnn --dataset DB_Pedia --epochs 100 --st 0 -s 1 --gpu-num 8 --port 6632 --batch-size 512 -r --cluster-data
# Local-SGD
python main.py --lr 0.01 --model text_cnn --dataset DB_Pedia --epochs 100 --st 0 -s 1 --gpu-num 8 --port 6632 --batch-size 512 -r --cluster-data --local --period 50
# VRL-SGD
python main.py --lr 0.01 --model text_cnn --dataset DB_Pedia --epochs 100 --st 0 -s 1 --gpu-num 8 --port 6632 --batch-size 512 -r --cluster-data --local --period 50 --vrl

Identical Case

# S-SGD
python main.py --lr 0.01 --model text_cnn --dataset DB_Pedia --epochs 100 --st 0 -s 1 --gpu-num 8 --port 6632 --batch-size 512 -r 
# Local-SGD
python main.py --lr 0.01 --model text_cnn --dataset DB_Pedia --epochs 100 --st 0 -s 1 --gpu-num 8 --port 6632 --batch-size 512 -r  --local --period 50
# VRL-SGD
python main.py --lr 0.01 --model text_cnn --dataset DB_Pedia --epochs 100 --st 0 -s 1 --gpu-num 8 --port 6632 --batch-size 512 -r  --local --period 50 --vrl

Transfer Learning on Tiny ImageNet

Non-Identical Case

# S-SGD
python main.py --lr 0.025 --model mlp --dataset tiny_imagenet --epochs 300 -s 1 --gpu-num 8 --port 6632 --batch-size 256 -r  --cluster-data 
# Local-SGD
python main.py --lr 0.025 --model mlp --dataset tiny_imagenet --epochs 300 -s 1 --gpu-num 8 --port 6633 --batch-size 256 -r  --local  --period 20  --cluster-data
# VRL-SGD
python main.py --lr 0.025 --model mlp --dataset tiny_imagenet --epochs 300 -s 1 --gpu-num 8 --port 6634 --batch-size 256 -r --local --period 20 --vrl --cluster-data

Identical Case

# S-SGD
python main.py --lr 0.025 --model mlp --dataset tiny_imagenet --epochs 300 -s 1 --gpu-num 8 --port 6632 --batch-size 256 -r 
# Local-SGD
python main.py --lr 0.025 --model mlp --dataset tiny_imagenet --epochs 300 -s 1 --gpu-num 8 --port 6633 --batch-size 256 -r  --local  --period 20  
# VRL-SGD
python main.py --lr 0.025 --model mlp --dataset tiny_imagenet --epochs 300 -s 1 --gpu-num 8 --port 6634 --batch-size 256 -r  --local  --period 20 --vrl
