Note: this model repository is gated on Hugging Face; it is publicly viewable, but you must accept its access conditions before downloading the files.

Terjman-Nano-v2.0-512

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ar (the training dataset is not specified in this card). It achieves the following results on the evaluation set:

  • Loss: 4.5184
  • Bleu: 2.2033
  • Gen Len: 11.8452
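For intuition, the evaluation loss reported by the trainer is the mean cross-entropy per target token (in nats), so it maps directly to a per-token perplexity. A quick sketch:

```python
import math

# Eval loss is mean per-token cross-entropy (natural log), so perplexity = exp(loss).
eval_loss = 4.5184
perplexity = math.exp(eval_loss)
print(f"{perplexity:.1f}")  # ≈ 91.7
```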

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 128
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 40
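To make the schedule concrete, the effective batch size and warmup length implied by these numbers can be computed directly; the total step count is read off the results table below, so treat it as approximate:

```python
# Effective batch size: per-device batch * gradient accumulation steps.
train_batch_size = 64
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 128

# The results table logs step 1000 at epoch 0.5635, i.e. ~1775 optimizer
# steps per epoch, so ~71,000 optimizer steps over 40 epochs.
steps_per_epoch = round(1000 / 0.5635)
total_steps = steps_per_epoch * 40

# lr_scheduler_warmup_ratio = 0.05 -> linear warmup over ~5% of training.
warmup_steps = int(0.05 * total_steps)
print(total_train_batch_size, total_steps, warmup_steps)  # 128 71000 3550
```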

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|--------------:|------:|-----:|----------------:|-----:|--------:|
| 9.7501 | 0.5635 | 1000 | 6.1316 | 0.6938 | 11.4881 |
| 7.6553 | 1.1268 | 2000 | 5.4289 | 1.0007 | 10.4235 |
| 6.7699 | 1.6903 | 3000 | 5.0480 | 1.2106 | 10.835 |
| 6.1954 | 2.2536 | 4000 | 4.7737 | 1.7367 | 10.9592 |
| 6.1181 | 2.8171 | 5000 | 4.6615 | 2.0549 | 11.1088 |
| 5.9065 | 3.3804 | 6000 | 4.6103 | 2.0903 | 10.9507 |
| 5.9562 | 3.9439 | 7000 | 4.5762 | 2.1818 | 11.7874 |
| 5.8241 | 4.5072 | 8000 | 4.5597 | 2.2354 | 11.7704 |
| 5.9395 | 5.0704 | 9000 | 4.5460 | 2.2286 | 11.7874 |
| 5.7882 | 5.6340 | 10000 | 4.5379 | 2.2387 | 11.7925 |
| 5.6615 | 6.1972 | 11000 | 4.5317 | 2.2517 | 11.8214 |
| 5.7702 | 6.7608 | 12000 | 4.5299 | 2.2277 | 11.8095 |
| 5.7982 | 7.3240 | 13000 | 4.5274 | 2.248 | 11.8248 |
| 5.8247 | 7.8876 | 14000 | 4.5253 | 2.2272 | 11.7262 |
| 5.7944 | 8.4508 | 15000 | 4.5242 | 2.2102 | 11.7483 |
| 5.7937 | 9.0141 | 16000 | 4.5219 | 2.0453 | 11.8248 |
| 5.8086 | 9.5776 | 17000 | 4.5242 | 2.2002 | 11.7942 |
| 5.7118 | 10.1409 | 18000 | 4.5207 | 2.1983 | 11.8197 |
| 5.6221 | 10.7044 | 19000 | 4.5195 | 2.2475 | 11.8588 |
| 5.7131 | 11.2677 | 20000 | 4.5189 | 2.2372 | 11.767 |
| 5.6595 | 11.8312 | 21000 | 4.5183 | 2.2103 | 11.8214 |
| 5.7572 | 12.3945 | 22000 | 4.5188 | 2.1995 | 11.8827 |
| 5.7426 | 12.9580 | 23000 | 4.5173 | 2.0773 | 11.7738 |
| 5.7731 | 13.5213 | 24000 | 4.5184 | 2.2054 | 11.7823 |
| 5.6443 | 14.0845 | 25000 | 4.5184 | 2.214 | 11.8282 |
| 5.7615 | 14.6481 | 26000 | 4.5176 | 1.9705 | 11.8214 |
| 5.6754 | 15.2113 | 27000 | 4.5187 | 2.2401 | 11.8027 |
| 5.902 | 15.7749 | 28000 | 4.5182 | 2.2285 | 11.7891 |
| 5.8776 | 16.3381 | 29000 | 4.5175 | 2.1819 | 11.8265 |
| 5.7233 | 16.9017 | 30000 | 4.5182 | 2.1982 | 11.8061 |
| 5.732 | 17.4649 | 31000 | 4.5173 | 2.2053 | 11.7891 |
| 5.7165 | 18.0282 | 32000 | 4.5183 | 2.1991 | 11.8537 |
| 5.8338 | 18.5917 | 33000 | 4.5188 | 2.1873 | 11.8248 |
| 5.8152 | 19.1550 | 34000 | 4.5180 | 2.1978 | 11.7568 |
| 5.597 | 19.7185 | 35000 | 4.5182 | 2.2272 | 11.7976 |
| 5.7124 | 20.2818 | 36000 | 4.5181 | 2.1915 | 11.8997 |
| 5.8329 | 20.8453 | 37000 | 4.5184 | 1.9777 | 11.7653 |
| 5.7707 | 21.4086 | 38000 | 4.5170 | 2.2169 | 11.8418 |
| 5.8133 | 21.9721 | 39000 | 4.5177 | 2.1797 | 11.881 |
| 5.7323 | 22.5354 | 40000 | 4.5179 | 2.1909 | 11.8282 |
| 5.8272 | 23.0986 | 41000 | 4.5180 | 2.2036 | 11.8044 |
| 5.7333 | 23.6622 | 42000 | 4.5179 | 2.2158 | 11.7891 |
| 5.7345 | 24.2254 | 43000 | 4.5185 | 1.967 | 11.8112 |
| 5.7984 | 24.7890 | 44000 | 4.5184 | 2.2096 | 11.7296 |
| 5.7832 | 25.3522 | 45000 | 4.5179 | 2.1928 | 11.8844 |
| 5.7056 | 25.9158 | 46000 | 4.5179 | 2.2039 | 11.7908 |
| 5.6642 | 26.4790 | 47000 | 4.5188 | 2.1819 | 11.7721 |
| 5.8378 | 27.0423 | 48000 | 4.5175 | 2.172 | 11.8163 |
| 5.6316 | 27.6058 | 49000 | 4.5177 | 2.1752 | 11.8146 |
| 5.6802 | 28.1691 | 50000 | 4.5180 | 2.2163 | 11.818 |
| 5.7301 | 28.7326 | 51000 | 4.5175 | 2.0041 | 11.8163 |
| 5.7853 | 29.2959 | 52000 | 4.5184 | 2.2214 | 11.8401 |
| 5.9104 | 29.8594 | 53000 | 4.5183 | 2.1885 | 11.8316 |
| 5.7037 | 30.4227 | 54000 | 4.5178 | 2.1707 | 11.7925 |
| 5.6241 | 30.9862 | 55000 | 4.5179 | 2.2225 | 11.7993 |
| 5.744 | 31.5495 | 56000 | 4.5179 | 2.2003 | 11.8146 |
| 5.7843 | 32.1127 | 57000 | 4.5177 | 2.2002 | 11.869 |
| 5.8889 | 32.6762 | 58000 | 4.5180 | 2.2385 | 11.7806 |
| 5.7761 | 33.2395 | 59000 | 4.5178 | 2.1707 | 11.8027 |
| 5.8273 | 33.8030 | 60000 | 4.5178 | 2.2267 | 11.8469 |
| 5.6471 | 34.3663 | 61000 | 4.5186 | 2.1974 | 11.8061 |
| 5.7228 | 34.9298 | 62000 | 4.5179 | 2.2024 | 11.7721 |
| 5.6731 | 35.4931 | 63000 | 4.5186 | 2.0918 | 11.7993 |
| 5.5668 | 36.0564 | 64000 | 4.5177 | 2.1991 | 11.7466 |
| 5.6701 | 36.6199 | 65000 | 4.5181 | 2.193 | 11.7857 |
| 5.6992 | 37.1832 | 66000 | 4.5177 | 2.1926 | 11.8452 |
| 5.6918 | 37.7467 | 67000 | 4.5179 | 2.0086 | 11.7789 |
| 5.7022 | 38.3099 | 68000 | 4.5183 | 1.974 | 11.7976 |
| 5.8113 | 38.8735 | 69000 | 4.5175 | 1.9785 | 11.8486 |
| 5.865 | 39.4367 | 70000 | 4.5184 | 2.2033 | 11.8452 |
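The table is long, so it is worth stating what it shows: nearly all of the improvement happens in the first ~8,000 optimizer steps, after which validation loss and BLEU plateau. A quick check on a few checkpoints from the table:

```python
# (step, validation loss) pairs taken from the results table above.
val_loss = {1000: 6.1316, 4000: 4.7737, 8000: 4.5597, 21000: 4.5183, 70000: 4.5184}

early_drop = val_loss[1000] - val_loss[8000]   # loss shed in the first 8k steps
late_drop = val_loss[8000] - val_loss[70000]   # loss shed in the remaining 62k
print(f"{early_drop:.4f} vs {late_drop:.4f}")  # 1.5719 vs 0.0413
```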

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0
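For completeness, a minimal inference sketch using the standard transformers translation pipeline. This assumes you have accepted the repo's access conditions and are logged in to the Hub; the example sentence is arbitrary:

```python
from transformers import pipeline

# The model is a fine-tune of Helsinki-NLP/opus-mt-en-ar (MarianMT-style,
# en->ar), so the generic translation pipeline works out of the box.
# Note: the repo is gated; accept its conditions and run `huggingface-cli login` first.
translator = pipeline(
    "translation",
    model="BounharAbdelaziz/Terjman-Nano-v2.0-512",
)

out = translator("Hello, how are you today?", max_length=512)
print(out[0]["translation_text"])
```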
Model size

  • 76.4M parameters
  • Format: Safetensors
  • Tensor type: BF16
