metadata
license: apache-2.0
base_model: google/flan-t5-base
tags:
  - generated_from_trainer
datasets:
  - samsum
metrics:
  - rouge
model-index:
  - name: flan-t5-base-samsum
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: samsum
          type: samsum
          config: samsum
          split: test
          args: samsum
        metrics:
          - name: Rouge1
            type: rouge
            value: 47.1046

flan-t5-base-samsum

This model is a fine-tuned version of google/flan-t5-base on the samsum dataset, a dialogue summarization corpus. It achieves the following results on the evaluation set (the values match the step-1000 row under Training results):

  • Loss: 1.3859
  • Rouge1: 47.1046
  • Rouge2: 23.264
  • Rougel: 39.2757
  • Rougelsum: 43.2598
  • Gen Len: 17.3333
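To make the ROUGE numbers above easier to interpret, here is a minimal sketch of a ROUGE-1 F-measure computed as clipped unigram overlap. It is a simplification, not the scorer used to produce the reported values: it assumes lowercasing and whitespace tokenization with no stemming, and the function name `rouge1_f1` is hypothetical. The scores in this card appear to be on a 0 to 100 scale, so a value of 0.47 from this sketch would correspond to roughly 47 there.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure: clipped unigram overlap.

    Assumption: lowercase + whitespace tokenization, no stemming.
    Real ROUGE implementations tokenize more carefully.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge1_f1("the cat sat on the mat", "the cat is on the mat")
    print(f"ROUGE-1 F1 (0-1 scale): {score:.4f}")
```

ROUGE-2 follows the same pattern over bigrams, and ROUGE-L replaces the overlap count with the longest common subsequence length.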

Model description

More information needed

Intended uses & limitations

More information needed
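Until this section is filled in, the following is a minimal sketch of how a seq2seq checkpoint like this one could be loaded for dialogue summarization. Several things here are assumptions rather than facts from this card: the repository id in `MODEL_ID` (substitute the actual Hub id or a local path), the `max_new_tokens` value, and the helper names `format_dialogue` and `summarize`. The card does not say whether a task prefix such as "summarize: " was used during fine-tuning, so none is added.

```python
MODEL_ID = "texasdave2/flan-t5-base-samsum"  # assumed repo id, replace as needed


def format_dialogue(turns):
    """Join (speaker, text) pairs into 'Speaker: text' lines, one per turn."""
    return "\n".join(f"{speaker}: {text}" for speaker, text in turns)


def summarize(dialogue, model_id=MODEL_ID, max_new_tokens=64):
    """Generate a summary for one dialogue string.

    transformers is imported lazily so the formatting helper above can be
    used without it. Calling this downloads the model on first use.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(dialogue, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would look like `summarize(format_dialogue([("Amanda", "I baked cookies. Do you want some?"), ("Jerry", "Sure!")]))`. The output depends on the checkpoint and is not shown here.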

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
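As an illustration of what `lr_scheduler_type: linear` implies for the learning rate over the run, here is a small sketch of a linear decay schedule. It rests on assumptions not stated in this card: zero warmup steps, a single device with no gradient accumulation, and 14,732 training examples (the commonly cited SAMSum train split size). Under those assumptions there are 614 steps per epoch, which is consistent with the training log reporting step 600 at epoch 0.98. The function name `linear_lr` is hypothetical.

```python
import math

BASE_LR = 5e-05            # learning_rate from the list above
TRAIN_BATCH_SIZE = 24      # train_batch_size from the list above
NUM_EPOCHS = 2             # num_epochs from the list above
ASSUMED_TRAIN_EXAMPLES = 14_732  # assumption: SAMSum train split size

STEPS_PER_EPOCH = math.ceil(ASSUMED_TRAIN_EXAMPLES / TRAIN_BATCH_SIZE)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, STEPS_PER_EPOCH, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate is halved by the end of the first epoch and reaches zero at the final step.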

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 1.5121        | 0.08  | 50   | 1.4287          | 46.7868 | 22.863  | 38.971  | 42.8209   | 16.9634 |
| 1.46          | 0.16  | 100  | 1.4199          | 46.8031 | 22.8195 | 39.0708 | 42.8717   | 17.2393 |
| 1.4515        | 0.24  | 150  | 1.4147          | 46.6849 | 23.0376 | 38.9434 | 42.8344   | 17.1245 |
| 1.4679        | 0.33  | 200  | 1.4121          | 46.8756 | 22.8504 | 39.1671 | 43.1892   | 17.3431 |
| 1.451         | 0.41  | 250  | 1.4109          | 46.8572 | 23.09   | 39.2939 | 43.2955   | 17.2686 |
| 1.4434        | 0.49  | 300  | 1.4040          | 46.6829 | 23.071  | 39.3131 | 43.1432   | 16.9158 |
| 1.4417        | 0.57  | 350  | 1.4007          | 46.8637 | 23.0661 | 39.2462 | 43.1897   | 17.1172 |
| 1.4781        | 0.65  | 400  | 1.3952          | 46.8511 | 23.1134 | 39.3071 | 43.2164   | 17.2076 |
| 1.4626        | 0.73  | 450  | 1.3940          | 47.1533 | 23.2771 | 39.3094 | 43.2806   | 17.2222 |
| 1.4307        | 0.81  | 500  | 1.3955          | 46.9527 | 23.2227 | 39.2844 | 43.1903   | 17.2002 |
| 1.4586        | 0.9   | 550  | 1.3933          | 46.7523 | 23.1759 | 39.2675 | 43.1588   | 17.3040 |
| 1.4465        | 0.98  | 600  | 1.3905          | 46.855  | 23.3518 | 39.2879 | 43.2145   | 17.3468 |
| 1.381         | 1.06  | 650  | 1.3953          | 46.9719 | 22.9788 | 39.0886 | 43.1892   | 17.4066 |
| 1.4125        | 1.14  | 700  | 1.3922          | 46.535  | 23.0956 | 38.9275 | 42.9811   | 17.2381 |
| 1.3667        | 1.22  | 750  | 1.3922          | 47.3311 | 23.4123 | 39.5412 | 43.5624   | 17.2930 |
| 1.3878        | 1.3   | 800  | 1.3953          | 46.6737 | 23.2153 | 39.2982 | 43.2596   | 17.3358 |
| 1.3884        | 1.38  | 850  | 1.3931          | 46.9764 | 23.1561 | 39.1606 | 43.2115   | 17.3614 |
| 1.3766        | 1.47  | 900  | 1.3898          | 47.0466 | 23.1674 | 39.2822 | 43.293    | 17.3333 |
| 1.3727        | 1.55  | 950  | 1.3889          | 46.7311 | 23.0837 | 39.0882 | 43.0072   | 17.3211 |
| 1.4001        | 1.63  | 1000 | 1.3859          | 47.1046 | 23.264  | 39.2757 | 43.2598   | 17.3333 |
| 1.3894        | 1.71  | 1050 | 1.3874          | 47.2479 | 23.3762 | 39.4723 | 43.5241   | 17.3297 |
| 1.3697        | 1.79  | 1100 | 1.3860          | 47.1037 | 23.3894 | 39.3848 | 43.3875   | 17.3504 |
| 1.3886        | 1.87  | 1150 | 1.3862          | 47.0714 | 23.3937 | 39.4181 | 43.3841   | 17.3260 |
| 1.4037        | 1.95  | 1200 | 1.3861          | 47.0725 | 23.4085 | 39.3575 | 43.3676   | 17.3321 |
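The log above can be scanned programmatically to see which evaluation step is best under a given criterion. The sketch below copies the Step, Validation Loss, and Rouge1 columns from the table and picks the row with the lowest validation loss and the row with the highest Rouge1. The variable names are illustrative, and the card does not state which criterion, if any, was used to select the reported checkpoint.

```python
# (step, validation_loss, rouge1) copied from the training results table.
EVAL_LOG = [
    (50, 1.4287, 46.7868), (100, 1.4199, 46.8031), (150, 1.4147, 46.6849),
    (200, 1.4121, 46.8756), (250, 1.4109, 46.8572), (300, 1.4040, 46.6829),
    (350, 1.4007, 46.8637), (400, 1.3952, 46.8511), (450, 1.3940, 47.1533),
    (500, 1.3955, 46.9527), (550, 1.3933, 46.7523), (600, 1.3905, 46.855),
    (650, 1.3953, 46.9719), (700, 1.3922, 46.535), (750, 1.3922, 47.3311),
    (800, 1.3953, 46.6737), (850, 1.3931, 46.9764), (900, 1.3898, 47.0466),
    (950, 1.3889, 46.7311), (1000, 1.3859, 47.1046), (1050, 1.3874, 47.2479),
    (1100, 1.3860, 47.1037), (1150, 1.3862, 47.0714), (1200, 1.3861, 47.0725),
]

# Lowest validation loss and highest Rouge1 can point at different steps.
best_by_loss = min(EVAL_LOG, key=lambda row: row[1])
best_by_rouge1 = max(EVAL_LOG, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest Rouge1:", best_by_rouge1)
```

The lowest validation loss occurs at step 1000, which is the row whose metrics match the headline results, while the highest Rouge1 occurs at step 750. Differences between neighboring rows are small, so the ranking may not be significant.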

Framework versions

  • Transformers 4.33.3
  • PyTorch 2.0.0+cu117
  • Datasets 2.14.5
  • Tokenizers 0.13.3