# Finetune on Short Sequence Dataset

한국어 | English

## Details

- KoBigBird performance evaluation in the `max_seq_length <= 512` setting
- Evaluated on a total of 5 datasets
  - Single Sentence Classification: NSMC
  - Sentence Pair Classification: KLUE-NLI, KLUE-STS
  - Question Answering: KorQuAD 1.0, KLUE-MRC
- Based on the KLUE-Baseline code with some modifications
  - Added the NSMC and KorQuAD 1.0 tasks
  - Fixed compatibility with `transformers==4.11.3`
- Sequence Classification is trained with a sequence length of 128 and Question Answering with a length of 512 (see the sketch after this list)
  - Full Attention is used instead of Sparse Attention (automatically switched to Full Attention with the following log)

    ```
    Attention type 'block_sparse' is not possible if sequence_length: 300 <= num global tokens: 2 * config.block_size + min. num sliding tokens: 3 * config.block_size
    + config.num_random_blocks * config.block_size + additional buffer: config.num_random_blocks * config.block_size = 704 with config.block_size = 64, config.num_random_blocks = 3.
    Changing attention type to 'original_full'...
    ```

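As a rough sketch of the setup above (not the actual KLUE-Baseline training code), the snippet below loads KoBigBird with `transformers==4.11.3` and tokenizes a single-sentence classification input at length 128, which triggers the fallback to full attention shown in the log. The checkpoint id `monologg/kobigbird-bert-base` and the sample review text are assumptions for illustration.

```python
# Minimal sketch of the setup described above (not the actual KLUE-Baseline
# training script). Assumes transformers==4.11.3 and the Hugging Face
# checkpoint "monologg/kobigbird-bert-base".
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "monologg/kobigbird-bert-base"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Sequence Classification uses max_seq_length=128. With inputs this short,
# BigBird emits the log shown above and falls back to 'original_full' attention;
# it can also be set explicitly via from_pretrained(..., attention_type="original_full").
inputs = tokenizer(
    "영화 정말 재미있게 봤습니다",  # hypothetical NSMC-style review
    max_length=128,
    padding="max_length",
    truncation=True,
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, 2) for binary sentiment labels
print(logits.argmax(dim=-1))
```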
## Result

| Model | NSMC (acc) | KLUE-NLI (acc) | KLUE-STS (pearsonr) | KorQuAD 1.0 (em/f1) | KLUE-MRC (em/rouge-w) |
| --- | --- | --- | --- | --- | --- |
| KoELECTRA-Base-v3 | 91.13 | 86.87 | 93.14 | 85.66 / 93.94 | 59.54 / 65.64 |
| KLUE-RoBERTa-Base | 91.16 | 86.30 | 92.91 | 85.35 / 94.53 | 69.56 / 74.64 |
| KoBigBird-BERT-Base | 91.18 | 87.17 | 92.61 | 87.08 / 94.71 | 70.33 / 75.34 |
- KLUE and KorQuAD 1.0 are evaluated on the dev set.
- For KoELECTRA-Base-v3 and KLUE-RoBERTa-Base, the KLUE dataset scores are taken from "A. Dev Set Results" in the KLUE paper.
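
For reference, a rough sketch of how the metric columns above might be computed is shown below; it uses `scipy` and `scikit-learn` for illustration and is not the evaluation code that produced these numbers (that follows the KLUE-Baseline and KorQuAD scripts).

```python
# Illustrative sketch of the metrics in the table above; the reported scores
# come from the KLUE-Baseline / KorQuAD evaluation scripts, not this snippet.
from scipy.stats import pearsonr
from sklearn.metrics import accuracy_score

# acc (NSMC, KLUE-NLI): fraction of correctly predicted labels
acc = accuracy_score([1, 0, 1, 1], [1, 0, 0, 1])

# pearsonr (KLUE-STS): Pearson correlation between predicted and gold scores
corr, _ = pearsonr([0.1, 0.5, 0.9, 0.3], [0.2, 0.4, 1.0, 0.35])

# em (KorQuAD 1.0, KLUE-MRC): exact string match between predicted and gold answers
def exact_match(pred: str, gold: str) -> float:
    return float(pred.strip() == gold.strip())

print(f"acc={acc:.4f}, pearsonr={corr:.4f}, em={exact_match('서울', '서울')}")
```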

## Reference