
G-XLT performance for MLQA #11

Open
wasiahmad opened this issue Mar 19, 2020 · 0 comments

wasiahmad commented Mar 19, 2020

I am trying to reproduce the results presented in Table 6 of the paper for generalized cross-lingual transfer (G-XLT) using M-BERT.

[screenshot of Table 6 from the MLQA paper]

I have done the following.

  • Fine-tuned M-BERT using only the SQuAD 1.1 training dataset and validated on the MLQA English dev set.
  • Used the sliding-window approach from the BERT authors for long sequences (see google-research/bert#66, "Extracting features on for long sequences / SQuAD"), with maximum sequence length 384 and doc stride 128; a sketch of the windowing is given after this list.
  • Set the maximum answer length to 30.
  • Used the following settings during fine-tuning (a sketch of how they map onto my training loop also follows the list):
learning_rate = 5e-5
warmup_steps = 0
epochs = 3
gradient_accumulation_steps = 1
grad_clipping = 1.0
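
For reference, this is a minimal sketch of the windowing I mean, following the `_DocSpan` logic in the BERT authors' `run_squad.py` (the function and variable names here are my own):

```python
import collections

def make_doc_spans(num_doc_tokens, max_query_len, max_seq_length=384, doc_stride=128):
    """Split a long document into overlapping windows, following the
    _DocSpan logic in the BERT authors' run_squad.py."""
    DocSpan = collections.namedtuple("DocSpan", ["start", "length"])
    # Reserve room for [CLS], the query tokens, and two [SEP] tokens.
    max_tokens_for_doc = max_seq_length - max_query_len - 3
    spans = []
    start = 0
    while start < num_doc_tokens:
        length = min(max_tokens_for_doc, num_doc_tokens - start)
        spans.append(DocSpan(start=start, length=length))
        if start + length == num_doc_tokens:
            break
        start += min(length, doc_stride)
    return spans

# Example: a 900-token context with a 24-token question
print(make_doc_spans(900, max_query_len=24))
```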
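
And this is roughly how those settings enter my training loop, a sketch assuming `torch.optim.AdamW` plus the linear schedule from HuggingFace `transformers`; the model and dataloader are passed in from elsewhere:

```python
import torch
from transformers import get_linear_schedule_with_warmup

def fine_tune(model, train_dataloader, learning_rate=5e-5, warmup_steps=0,
              epochs=3, gradient_accumulation_steps=1, grad_clipping=1.0):
    """Fine-tune a QA model with the settings listed above."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
    t_total = len(train_dataloader) // gradient_accumulation_steps * epochs
    scheduler = get_linear_schedule_with_warmup(
        optimizer, num_warmup_steps=warmup_steps, num_training_steps=t_total)
    model.train()
    for _ in range(epochs):
        for step, batch in enumerate(train_dataloader):
            # Each batch holds input_ids, attention_mask, and the
            # start/end positions, so the model returns the loss first.
            loss = model(**batch)[0] / gradient_accumulation_steps
            loss.backward()
            if (step + 1) % gradient_accumulation_steps == 0:
                torch.nn.utils.clip_grad_norm_(model.parameters(), grad_clipping)
                optimizer.step()
                scheduler.step()
                optimizer.zero_grad()
```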

I got the following results. As you can see, the performance is very poor, particularly for Hindi and Vietnamese. I suspect a different inference algorithm is used in your work. Is it possible to briefly explain what you did during inference? For comparison, my own decoding is sketched below the results.

[screenshot of my M-BERT results]
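
On my side, I use the standard SQuAD-style span selection over the start/end logits of each window, with the length constraint above. A minimal sketch (names are illustrative):

```python
def best_span(start_logits, end_logits, max_answer_length=30):
    """Pick the highest-scoring valid answer span from one window's logits.
    Standard SQuAD-style decoding: score(i, j) = start_logits[i] + end_logits[j],
    keeping only spans whose length j - i + 1 is at most max_answer_length."""
    best_start, best_end, best_score = 0, 0, float("-inf")
    for i, s in enumerate(start_logits):
        for j in range(i, min(i + max_answer_length, len(end_logits))):
            score = s + end_logits[j]
            if score > best_score:
                best_start, best_end, best_score = i, j, score
    return best_start, best_end, best_score

# Toy example with hand-made logits; picks span (1, 2) with score 3.8
print(best_span([0.1, 2.0, 0.3], [0.2, 0.5, 1.8]))
```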
