
Terjman-Ultra-v2.1

This model is a fine-tuned version of facebook/nllb-200-1.3B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.3692
  • Bleu: 0.6174
  • Gen Len: 12.1553

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 5
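The hyperparameters above can be sanity-checked with a small sketch: the effective batch size is train_batch_size × gradient_accumulation_steps, and lr_scheduler_type=linear with warmup_ratio=0.03 means the learning rate ramps linearly to 1e-05 over the first 3% of optimizer steps, then decays linearly to zero. The total step count below (~35,660) is an estimate read off the results table (step 35,000 at epoch ≈ 4.91), not a value reported in this card, and `lr_at_step` is a hypothetical helper, not part of the training code.

```python
def lr_at_step(step, total_steps, peak_lr=1e-5, warmup_ratio=0.03):
    """Linear schedule with warmup, as implied by lr_scheduler_type=linear."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Linear decay from peak_lr down to 0 after warmup.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

total_steps = 35_660   # estimated from the results table, see note above
effective_batch = 8 * 4  # train_batch_size * gradient_accumulation_steps

print(effective_batch)                       # 32, matches total_train_batch_size
print(lr_at_step(0, total_steps))            # 0.0 (start of warmup)
print(lr_at_step(1_069, total_steps))        # 1e-05 (peak, end of warmup)
print(lr_at_step(total_steps, total_steps))  # 0.0 (fully decayed)
```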

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:------:|:-------:|
| 11.5533       | 0.1402 | 1000  | 4.1490          | 0.218  | 11.8129 |
| 9.4281        | 0.2805 | 2000  | 3.4144          | 0.4264 | 12.0859 |
| 8.8831        | 0.4207 | 3000  | 3.3817          | 0.6161 | 12.1518 |
| 9.0489        | 0.5610 | 4000  | 3.3748          | 0.4762 | 12.14   |
| 8.8534        | 0.7012 | 5000  | 3.3727          | 0.4529 | 12.2059 |
| 8.5959        | 0.8415 | 6000  | 3.3716          | 0.5023 | 12.1741 |
| 9.1659        | 0.9817 | 7000  | 3.3705          | 0.4831 | 12.1694 |
| 8.8922        | 1.1219 | 8000  | 3.3707          | 0.4863 | 12.1541 |
| 9.05          | 1.2621 | 9000  | 3.3705          | 0.6146 | 12.1812 |
| 8.9628        | 1.4024 | 10000 | 3.3708          | 0.476  | 12.2906 |
| 8.7346        | 1.5426 | 11000 | 3.3697          | 0.4825 | 12.1541 |
| 9.508         | 1.6829 | 12000 | 3.3699          | 0.4869 | 12.1471 |
| 9.1153        | 1.8231 | 13000 | 3.3701          | 0.4592 | 12.1635 |
| 8.8908        | 1.9634 | 14000 | 3.3701          | 0.64   | 12.2047 |
| 9.01          | 2.1035 | 15000 | 3.3697          | 0.4836 | 12.1553 |
| 9.5064        | 2.2438 | 16000 | 3.3701          | 0.6171 | 12.1765 |
| 9.1053        | 2.3840 | 17000 | 3.3700          | 0.4589 | 12.1482 |
| 8.9289        | 2.5242 | 18000 | 3.3696          | 0.6097 | 12.1588 |
| 9.0108        | 2.6645 | 19000 | 3.3696          | 0.6209 | 12.3306 |
| 9.4611        | 2.8047 | 20000 | 3.3700          | 0.4831 | 12.1894 |
| 9.1424        | 2.9450 | 21000 | 3.3697          | 0.4829 | 12.1612 |
| 9.1961        | 3.0851 | 22000 | 3.3693          | 0.6056 | 12.1671 |
| 8.7144        | 3.2254 | 23000 | 3.3697          | 0.6108 | 12.3435 |
| 9.0528        | 3.3656 | 24000 | 3.3700          | 0.477  | 12.1871 |
| 8.8442        | 3.5059 | 25000 | 3.3693          | 0.4587 | 12.1588 |
| 9.3264        | 3.6461 | 26000 | 3.3701          | 0.615  | 12.1624 |
| 8.7752        | 3.7864 | 27000 | 3.3697          | 0.4807 | 12.24   |
| 8.9259        | 3.9266 | 28000 | 3.3695          | 0.4621 | 12.1929 |
| 9.0475        | 4.0668 | 29000 | 3.3699          | 0.5017 | 12.1906 |
| 8.7334        | 4.2070 | 30000 | 3.3701          | 0.4808 | 12.1824 |
| 8.8332        | 4.3473 | 31000 | 3.3693          | 0.4726 | 12.2518 |
| 9.0679        | 4.4875 | 32000 | 3.3697          | 0.4642 | 12.2047 |
| 9.2859        | 4.6277 | 33000 | 3.3691          | 0.6193 | 12.1671 |
| 8.7846        | 4.7680 | 34000 | 3.3697          | 0.615  | 12.1718 |
| 9.0892        | 4.9082 | 35000 | 3.3692          | 0.6174 | 12.1553 |

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0
Model size: 1.37B params (Safetensors, BF16)

Model tree for BounharAbdelaziz/Terjman-Ultra-v2.1

  • Finetuned from: facebook/nllb-200-1.3B