
Terjman-Ultra-v2.1

This model is a fine-tuned version of facebook/nllb-200-1.3B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.3692
  • Bleu: 0.6174
  • Gen Len: 12.1553

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 5
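The hyperparameters above can be sanity-checked with a small sketch: the effective batch size is train_batch_size × gradient_accumulation_steps, and lr_scheduler_type=linear with warmup_ratio=0.03 means the learning rate ramps linearly to 1e-05 over the first 3% of optimizer steps, then decays linearly to zero. The total step count below (~35,660) is an estimate read off the results table (step 35,000 at epoch ≈ 4.91), not a value reported in this card, and `lr_at_step` is a hypothetical helper, not part of the training code.

```python
def lr_at_step(step, total_steps, peak_lr=1e-5, warmup_ratio=0.03):
    """Linear schedule with warmup, as implied by lr_scheduler_type=linear."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Linear decay from peak_lr down to 0 after warmup.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

total_steps = 35_660   # estimated from the results table, see note above
effective_batch = 8 * 4  # train_batch_size * gradient_accumulation_steps

print(effective_batch)                       # 32, matches total_train_batch_size
print(lr_at_step(0, total_steps))            # 0.0 (start of warmup)
print(lr_at_step(1_069, total_steps))        # 1e-05 (peak, end of warmup)
print(lr_at_step(total_steps, total_steps))  # 0.0 (fully decayed)
```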

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:------:|:-------:|
| 11.5533       | 0.1402 | 1000  | 4.1490          | 0.218  | 11.8129 |
| 9.4281        | 0.2805 | 2000  | 3.4144          | 0.4264 | 12.0859 |
| 8.8831        | 0.4207 | 3000  | 3.3817          | 0.6161 | 12.1518 |
| 9.0489        | 0.5610 | 4000  | 3.3748          | 0.4762 | 12.14   |
| 8.8534        | 0.7012 | 5000  | 3.3727          | 0.4529 | 12.2059 |
| 8.5959        | 0.8415 | 6000  | 3.3716          | 0.5023 | 12.1741 |
| 9.1659        | 0.9817 | 7000  | 3.3705          | 0.4831 | 12.1694 |
| 8.8922        | 1.1219 | 8000  | 3.3707          | 0.4863 | 12.1541 |
| 9.05          | 1.2621 | 9000  | 3.3705          | 0.6146 | 12.1812 |
| 8.9628        | 1.4024 | 10000 | 3.3708          | 0.476  | 12.2906 |
| 8.7346        | 1.5426 | 11000 | 3.3697          | 0.4825 | 12.1541 |
| 9.508         | 1.6829 | 12000 | 3.3699          | 0.4869 | 12.1471 |
| 9.1153        | 1.8231 | 13000 | 3.3701          | 0.4592 | 12.1635 |
| 8.8908        | 1.9634 | 14000 | 3.3701          | 0.64   | 12.2047 |
| 9.01          | 2.1035 | 15000 | 3.3697          | 0.4836 | 12.1553 |
| 9.5064        | 2.2438 | 16000 | 3.3701          | 0.6171 | 12.1765 |
| 9.1053        | 2.3840 | 17000 | 3.3700          | 0.4589 | 12.1482 |
| 8.9289        | 2.5242 | 18000 | 3.3696          | 0.6097 | 12.1588 |
| 9.0108        | 2.6645 | 19000 | 3.3696          | 0.6209 | 12.3306 |
| 9.4611        | 2.8047 | 20000 | 3.3700          | 0.4831 | 12.1894 |
| 9.1424        | 2.9450 | 21000 | 3.3697          | 0.4829 | 12.1612 |
| 9.1961        | 3.0851 | 22000 | 3.3693          | 0.6056 | 12.1671 |
| 8.7144        | 3.2254 | 23000 | 3.3697          | 0.6108 | 12.3435 |
| 9.0528        | 3.3656 | 24000 | 3.3700          | 0.477  | 12.1871 |
| 8.8442        | 3.5059 | 25000 | 3.3693          | 0.4587 | 12.1588 |
| 9.3264        | 3.6461 | 26000 | 3.3701          | 0.615  | 12.1624 |
| 8.7752        | 3.7864 | 27000 | 3.3697          | 0.4807 | 12.24   |
| 8.9259        | 3.9266 | 28000 | 3.3695          | 0.4621 | 12.1929 |
| 9.0475        | 4.0668 | 29000 | 3.3699          | 0.5017 | 12.1906 |
| 8.7334        | 4.2070 | 30000 | 3.3701          | 0.4808 | 12.1824 |
| 8.8332        | 4.3473 | 31000 | 3.3693          | 0.4726 | 12.2518 |
| 9.0679        | 4.4875 | 32000 | 3.3697          | 0.4642 | 12.2047 |
| 9.2859        | 4.6277 | 33000 | 3.3691          | 0.6193 | 12.1671 |
| 8.7846        | 4.7680 | 34000 | 3.3697          | 0.615  | 12.1718 |
| 9.0892        | 4.9082 | 35000 | 3.3692          | 0.6174 | 12.1553 |

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0
Model size: 1.37B params (Safetensors, BF16)

Model tree for BounharAbdelaziz/Terjman-Ultra-v2.1

  • Finetuned from: facebook/nllb-200-1.3B