This repository has been archived by the owner on Dec 29, 2022. It is now read-only.
Hi
Could someone explain the exact difference between cleaning the data and using the .max_seq_len parameter when starting training?
I'm aware that cleaning removes ill-formed sentences as well as sentences that are too long. But what is the point of cleaning with Moses to lengths between 1 and 80 in the wmt16_en_de.sh script if .max_seq_len is then set to a value smaller than 80 for training?
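To make the contrast concrete, here is a minimal Python sketch of the two steps. It rests on two assumptions that the question itself does not settle: that the cleaning step drops a whole sentence pair when either side falls outside 1 to 80 whitespace-separated tokens, and that .max_seq_len shortens an over-long sequence to the limit instead of discarding it. The names clean_corpus and apply_max_seq_len are hypothetical and come from neither the script nor the library, and the sketch covers length filtering only.

```python
def clean_corpus(pairs, min_len=1, max_len=80):
    """Offline cleaning (assumed behaviour): drop a sentence pair entirely
    when the source or target length is outside [min_len, max_len]."""
    kept = []
    for src, tgt in pairs:
        src_len, tgt_len = len(src.split()), len(tgt.split())
        if min_len <= src_len <= max_len and min_len <= tgt_len <= max_len:
            kept.append((src, tgt))
    return kept


def apply_max_seq_len(tokens, max_seq_len):
    """Training-time limit (assumed behaviour): keep the example but
    cut it down to the first max_seq_len tokens."""
    return tokens[:max_seq_len]


# Toy corpus: a normal pair, an empty source, a 100-token source,
# and a pair with 60 tokens on each side.
pairs = [
    ("a b c", "x y z"),
    ("", "x"),
    (" ".join(["w"] * 100), "x y"),
    (" ".join(["w"] * 60), " ".join(["v"] * 60)),
]

cleaned = clean_corpus(pairs)  # empty and 100-token pairs are removed

# A limit of 50 (hypothetical value, smaller than 80) then shortens
# the 60-token pair instead of removing it.
truncated = [
    (apply_max_seq_len(s.split(), 50), apply_max_seq_len(t.split(), 50))
    for s, t in cleaned
]
```

Under these assumptions the two steps are not redundant: cleaning removes empty and extreme pairs from the corpus once, while the training-time limit only bounds the length of what the model sees.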