eng-fra

  • source language name: English

  • target language name: French

  • OPUS readme: README.md

  • model: transformer-align

  • source language code: en

  • target language code: fr

  • dataset: opus

  • release date: 2021-02-22

  • pre-processing: normalization + SentencePiece (spm32k,spm32k)

  • download original weights: opus-2021-02-22.zip

  • Training data:

    • eng-fra: Tatoeba-train (180923857)
  • Validation data:

    • eng-fra: Tatoeba-dev, 250098
    • total-size-shuffled: 249757
    • devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled!
  • Test data (sentences/words):

    • newsdiscussdev2015-enfr.eng-fra: 1500/27986
    • newsdiscusstest2015-enfr.eng-fra: 1500/28027
    • newssyscomb2009.eng-fra: 502/12334
    • news-test2008.eng-fra: 2051/52685
    • newstest2009.eng-fra: 2525/69278
    • newstest2010.eng-fra: 2489/66043
    • newstest2011.eng-fra: 3003/80626
    • newstest2012.eng-fra: 3003/78011
    • newstest2013.eng-fra: 3000/70037
    • Tatoeba-test.eng-fra: 10000/80769
    • tico19-test.eng-fra: 2100/64655
  • test set translations file: test.txt

  • test set scores file: eval.txt

  • BLEU-scores

    Test set                          BLEU
    Tatoeba-test.eng-fra              50.8
    tico19-test.eng-fra               41.8
    newsdiscusstest2015-enfr.eng-fra  40.8
    newstest2011.eng-fra              34.6
    newsdiscussdev2015-enfr.eng-fra   33.9
    newstest2013.eng-fra              33.5
    newstest2010.eng-fra              33.0
    newstest2012.eng-fra              32.0
    newssyscomb2009.eng-fra           30.0
    newstest2009.eng-fra              29.9
    news-test2008.eng-fra             27.5
  • chr-F-scores

    Test set                          chr-F
    Tatoeba-test.eng-fra              0.671
    newsdiscusstest2015-enfr.eng-fra  0.649
    tico19-test.eng-fra               0.638
    newstest2011.eng-fra              0.614
    newsdiscussdev2015-enfr.eng-fra   0.606
    newstest2010.eng-fra              0.599
    newstest2012.eng-fra              0.593
    newstest2013.eng-fra              0.591
    newssyscomb2009.eng-fra           0.587
    newstest2009.eng-fra              0.580
    news-test2008.eng-fra             0.556
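
To put the scores in context, the following Python sketch pairs each test set's BLEU score with its average sentence length. It reads each `N/M` pair in the test data list as sentences/words, which is an assumption about what the two numbers mean. All figures are copied from the lists above; the names `TEST_SETS` and `summarize` are illustrative only and not part of any released tooling.

```python
from typing import Dict, List, Tuple

# (sentences, words, BLEU) per test set, copied from the lists above.
# Assumption: the "N/M" pairs in the test data list mean sentences/words.
TEST_SETS: Dict[str, Tuple[int, int, float]] = {
    "newsdiscussdev2015-enfr.eng-fra": (1500, 27986, 33.9),
    "newsdiscusstest2015-enfr.eng-fra": (1500, 28027, 40.8),
    "newssyscomb2009.eng-fra": (502, 12334, 30.0),
    "news-test2008.eng-fra": (2051, 52685, 27.5),
    "newstest2009.eng-fra": (2525, 69278, 29.9),
    "newstest2010.eng-fra": (2489, 66043, 33.0),
    "newstest2011.eng-fra": (3003, 80626, 34.6),
    "newstest2012.eng-fra": (3003, 78011, 32.0),
    "newstest2013.eng-fra": (3000, 70037, 33.5),
    "Tatoeba-test.eng-fra": (10000, 80769, 50.8),
    "tico19-test.eng-fra": (2100, 64655, 41.8),
}


def summarize(test_sets: Dict[str, Tuple[int, int, float]]) -> List[Tuple[str, float, float]]:
    """Return (name, words per sentence, BLEU) rows sorted by BLEU, highest first."""
    rows = [
        (name, round(words / sentences, 1), bleu)
        for name, (sentences, words, bleu) in test_sets.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Print one aligned line per test set.
for name, avg_len, bleu in summarize(TEST_SETS):
    print(f"{name:<34}{avg_len:>6.1f} words/sent  BLEU {bleu:.1f}")
```

Under that reading, the Tatoeba test set averages about 8 words per sentence and the news test sets roughly 19 to 28, which is worth keeping in mind when comparing scores across sets.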

System Info:

  • hf_name: eng-fra
  • source_languages: en
  • target_languages: fr
  • opus_readme_url: https://object.pouta.csc.fi/Tatoeba-MT-models/eng-fra/opus-2021-02-22.zip/README.md
  • original_repo: Tatoeba-Challenge
  • tags: ['translation']
  • languages: ['en', 'fr']
  • src_constituents: ['eng']
  • tgt_constituents: ['fra']
  • src_multilingual: False
  • tgt_multilingual: False
  • helsinki_git_sha: 6faf2dab0b7b01a0e08a114dbacbb7deac54988d
  • transformers_git_sha: e9a6c72b5edfb9561a981959b0e7c62d8ab9ef6c
  • port_machine: 146-193-182-187.edr.inesc.pt
  • port_time: 2023-11-08-11:42
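
The metadata above describes an English-to-French checkpoint ported to the transformers library. A minimal Python sketch for running it might look like the following. It assumes the port loads with the `MarianMTModel` and `MarianTokenizer` classes and that the `transformers`, `torch`, and `sentencepiece` packages are installed. The card does not state the repository ID, so `model_id` is a placeholder the caller must supply. The helper names `batched` and `translate` are hypothetical.

```python
from typing import Iterator, List


def batched(texts: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts` holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


def translate(texts: List[str], model_id: str, batch_size: int = 8) -> List[str]:
    """Translate English sentences to French with a Marian-style checkpoint.

    Assumption: `model_id` is the Hub repository ID or local directory where
    this port actually lives; the card itself does not give it.
    """
    # Imported lazily so `batched` stays usable without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    outputs: List[str] = []
    for batch in batched(texts, batch_size):
        # Pad to the longest sentence in the batch and return PyTorch tensors.
        encoded = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**encoded)
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return outputs
```

A call would then look like `translate(["How are you?"], model_id="<repository-or-local-path>")`, with the placeholder replaced by the real location of the weights.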