Skip to content

Latest commit

 

History

History
12 lines (6 loc) · 287 Bytes

README.md

File metadata and controls

12 lines (6 loc) · 287 Bytes

wmt19-tmx-extract

Extraction script for Paracrawl tmx files for use in WMT19

Adapted from Apertium TMX tools http://wiki.apertium.org/wiki/Tools_for_TMX

use:

gzcat tmx.gz | ./tmx-extract-parallel.py -b base_filename -s src_lang -t tgt_lang -c (optional - removes crlf in )