PyTorch implementation of the experiments described in our paper "How Effective are State Space Models for Machine Translation?" [1]. This repository contains code to replicate the results reported in the paper, focusing on the comparison between State Space Models (SSMs) and traditional Transformer architectures for machine translation tasks.
To set up the environment for our experiments, follow these steps:
- Set up the local `mamba_ssm` package as defined in our fork.
- Install the required packages listed in `requirements.txt`.
- Some datasets are downloaded from Hugging Face, while others are read from local files. Check the init function in `mt/ds/{dataset}.py` to see how each dataset is obtained (an illustrative sketch follows this list).
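As a rough, non-authoritative sketch of what such an init function might look like (the class name, argument names, local file layout, and Hugging Face config string below are assumptions, not the repository's actual code):

```python
# Hypothetical sketch only; the real init functions live in mt/ds/{dataset}.py
# and may differ in names, arguments, and loading logic.
from datasets import load_dataset  # Hugging Face `datasets` library


class WMT14EnDe:
    """Illustrative wrapper around WMT14 English-German data."""

    def __init__(self, split="train", local_path=None):
        if local_path is not None:
            # Assumed local layout: parallel files with one sentence per line.
            with open(f"{local_path}.en", encoding="utf-8") as f_en, \
                 open(f"{local_path}.de", encoding="utf-8") as f_de:
                self.pairs = list(zip(f_en.read().splitlines(),
                                      f_de.read().splitlines()))
        else:
            # Hugging Face hub: the "de-en" config covers the en-de pair.
            data = load_dataset("wmt14", "de-en", split=split)
            self.pairs = [(ex["translation"]["en"], ex["translation"]["de"])
                          for ex in data]
```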
To train the models described in the paper, use the following command:
python train.py --model mamba --dataset wmt14 --language_pair en de --devices 0 --use_padding
Model and dataset names can be found in `models/factory.py` and `mt/ds/factory.py`, respectively.
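For orientation, a name-to-constructor lookup along these lines is what such a factory typically does; the registry entries and constructor signatures here are assumptions, not the repository's actual code:

```python
# Hypothetical sketch of a factory; the real registries live in models/factory.py
# and mt/ds/factory.py and will differ in entries and signatures.
MODEL_REGISTRY = {
    # "mamba": MambaEncoderDecoder,        # entries like these are assumptions
    # "transformer": TransformerBaseline,
}


def build_model(name, **kwargs):
    """Look up and instantiate a model by the name passed on the command line."""
    if name not in MODEL_REGISTRY:
        raise ValueError(f"Unknown model '{name}'. Available: {sorted(MODEL_REGISTRY)}")
    return MODEL_REGISTRY[name](**kwargs)
```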
Additional configuration is set in `mt/run.py` and through command-line arguments defined in `utils/mt/argparser.py`.
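As a minimal sketch of how the flags used in the training command above could be declared with `argparse` (option names are taken from the command, but defaults, help strings, and everything else here are assumptions rather than the repository's actual definitions):

```python
# Hypothetical sketch of the command-line interface; the real definitions are in
# utils/mt/argparser.py and may use different names, defaults, and extra options.
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Train an MT model (illustrative only)")
    parser.add_argument("--model", type=str, default="mamba",
                        help="Model name resolved via models/factory.py")
    parser.add_argument("--dataset", type=str, default="wmt14",
                        help="Dataset name resolved via mt/ds/factory.py")
    parser.add_argument("--language_pair", nargs=2, default=["en", "de"],
                        help="Source and target language codes")
    parser.add_argument("--devices", type=int, nargs="+", default=[0],
                        help="GPU indices to train on")
    parser.add_argument("--use_padding", action="store_true",
                        help="Pad sequences in a batch to the same length")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```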
[1] Hugo Pitorro*, Pavlo Vasylenko*, Marcos Treviso, André F. T. Martins. "How Effective are State Space Models for Machine Translation?" Submitted to EMNLP 2024.
* Equal contribution
If you use this code or our results in your research, please cite our paper:
@article{pitorro2024effective,
  title={How Effective are State Space Models for Machine Translation?},
  author={Pitorro, Hugo and Vasylenko, Pavlo and Treviso, Marcos and Martins, André F. T.},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2024}
}