This repository implements the experiments from "Continuous Mask Predict: Enabling Better Control for Parallel Decoding Iterative Refinement."
The code is based off of the implementation of Mask-Predict: Parallel Decoding of Conditional Masked Language Models available here.
text=PATH_YOUR_DATA
output_dir=PATH_YOUR_OUTPUT
src=source_language
tgt=target_language
model_path=PATH_TO_MASKPREDICT_MODEL_DIR
python preprocess.py --source-lang ${src} --target-lang ${tgt} --trainpref $text/train --validpref $text/valid --testpref
Use the run_train.sh
script to launch training jobs on the G2 cluster.
For usage type:
run_train.sh --help
Schedule classifier training using
sbatch train_classifier.sh
.
Schedule AR training using
sbatch train_AR.sh
.
Use the [run_generate.sh
] script to perform evaluation.
For usage type:
run_generate.sh --help
MASK-PREDICT is CC-BY-NC 4.0. The license applies to the pre-trained models as well.