nan spearmanr in training log

When training on my experimental fitness dataset (~100 variants, binary labels), the Spearman correlation coefficient is sometimes reported as `nan`.
In the meta-training stage, evaluation always reports `nan` for spearmanr. The training loss declines as expected, but the evaluation metrics barely change during meta-training.
Oddly, when running cross-validation in meta-transfer, evaluation looks mostly normal, with `nan` appearing in only one of the five splits (see my meta-transfer logs).

How can I tell whether the model is trained well, or should I use other metrics?

Here is my shell script for meta-training and meta-transfer:
```sh
protein=AHCY_Human
echo "Meta-train PLMs on the auxiliary tasks for ${protein}"
python main.py -md esm2 -m meta -ts 40 -tb 1 -r 16 -ls 5 -mi 5 -mtb 16 -meb 64 -alr 5e-3 -as 5 -p ${protein}
echo "Transfer the meta-trained model to the target task"
python main.py -md esm2 -m meta-transfer -ts 40 -tb 16 -r 16 -ls 5 -mi 5 -mtb 16 -meb 64 -alr 5e-3 -as 5 -p ${protein}
echo "This may take several minutes, and the trained model will be saved to checkpoints/meta-transfer"
python main.py -md esm2 -m meta-transfer -ts 40 -tb 16 -r 16 -ls 5 -mi 5 -mtb 16 -meb 64 -alr 5e-3 -as 5 -p ${protein} -t
```

Here are my training logs:
```sh
Training epoch 13: 100%|| 6/6 [00:35<00:00,  6.00s/it, loss=4.64]
train_loss: 4.667
lr: 1.0e-04
Evaluating:  40%| | 2/5 [00:06<00:09,  3.26s/it, ndcg=0.431]
/mnt/data2024/zhqin/01.protein_design/Pro-FSFP/fsfp/trainer.py:182: ConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
  logs[metric] = spearmanr(predicts, targets).statistic
Evaluating:  80%|| 4/5 [00:12<00:03,  3.23s/it, ndcg=0.92]
/mnt/data2024/zhqin/01.protein_design/Pro-FSFP/fsfp/trainer.py:182: ConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
  logs[metric] = spearmanr(predicts, targets).statistic
Evaluating: 100%|| 5/5 [00:16<00:00,  3.25s/it, ndcg=0]
spearmanr: nan
ndcg: 0.396
topk_pr: 0.040
Training epoch 14: 100%|| 6/6 [00:36<00:00,  6.06s/it, loss=4.62]
train_loss: 4.560
lr: 1.0e-04

train_loss: 4.327
lr: 1.0e-04
Evaluating:  40%|| 2/5 [00:02<00:03,  1.10s/it, ndcg=0.431]/mnt/data2024/zhqin/01.protein_design/Pro-FSFP/fsfp/trainer.py:182: ConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
  logs[metric] = spearmanr(predicts, targets).statistic
Evaluating:  80%|| 4/5 [00:04<00:01,  1.09s/it, ndcg=0.877]/mnt/data2024/zhqin/01.protein_design/Pro-FSFP/fsfp/trainer.py:182: ConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
  logs[metric] = spearmanr(predicts, targets).statistic
Evaluating: 100%|| 5/5 [00:05<00:00,  1.10s/it, ndcg=0]
spearmanr: nan
ndcg: 0.388
topk_pr: 0.040
Early stopped at epoch 16
Best validating ndcg reached at epoch 1: 0.510
```
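
As far as I can tell, the `ConstantInputWarning` comes from SciPy itself: `spearmanr` returns `nan` whenever one of its two inputs has no variation. With binary labels, my guess is that some evaluation batches contain only one class, but I have not confirmed which array is constant at `trainer.py:182`. Below is a minimal reproduction, independent of the FSFP code; the arrays are made-up values for illustration only.

```python
import warnings

import numpy as np
from scipy.stats import spearmanr

# Hypothetical evaluation batch: predictions vary, but every label is 0.
predicts = np.array([0.12, 0.48, 0.31, 0.77, 0.05])
targets = np.zeros(5)

with warnings.catch_warnings():
    # Silence ConstantInputWarning so the demo output stays clean.
    warnings.simplefilter("ignore")
    rho_constant, _ = spearmanr(predicts, targets)

# Same predictions, but both classes are present: the coefficient is defined.
mixed_targets = np.array([0, 1, 0, 1, 0])
rho_mixed, _ = spearmanr(predicts, mixed_targets)

print("constant labels:", rho_constant)
print("mixed labels:   ", rho_mixed)
```

If this is what happens during meta-training, checking `np.unique(targets)` (or `np.unique(predicts)`) just before the `spearmanr` call should show which input collapses to a single value.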

Here are my meta-transfer training logs:
```sh
======================Cross validation: Split 1======================
spearmanr: 0.338
ndcg: 0.631
topk_pr: 0.050
Early stopped at epoch 16
Best validating spearmanr reached at epoch 1: 0.378
======================Cross validation: Split 2======================
spearmanr: 0.462
ndcg: 0.877
topk_pr: 0.100
Early stopped at epoch 16
Best validating spearmanr reached at epoch 1: 0.462
======================Cross validation: Split 3======================
spearmanr: 0.298
ndcg: 0.500
topk_pr: 0.050
Early stopped at epoch 16
Best validating spearmanr reached at epoch 1: 0.378
======================Cross validation: Split 4======================
spearmanr: nan
ndcg: 0.000
topk_pr: 0.000
Early stopped at epoch 15
Best validating spearmanr reached at epoch 0: -inf
======================Cross validation: Split 5======================
spearmanr: 0.338
ndcg: 0.631
topk_pr: 0.050
CV-estimated best validating spearmanr reached at epoch 1: nan
```
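
The `nan` in the CV-estimated line may simply be the `nan` from split 4 propagating through an average. I have not checked how the repository aggregates the splits, so this is only a sketch of the effect, using the final spearmanr of each split from the log above.

```python
import numpy as np

# Final spearmanr of the five splits, copied from the log above.
split_spearmanr = np.array([0.338, 0.462, 0.298, np.nan, 0.338])

# A plain mean is nan as soon as one split is undefined.
plain_mean = np.mean(split_spearmanr)

# nanmean averages only the splits where the coefficient is defined.
valid_mean = np.nanmean(split_spearmanr)
n_valid = int(np.sum(~np.isnan(split_spearmanr)))

print(f"mean={plain_mean}, nanmean={valid_mean:.3f} over {n_valid} splits")
```

Skipping undefined splits hides the fact that one split could not be evaluated, so the number of valid splits is worth reporting next to the average.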

And here are the final testing results:
```sh
======================Breakdown results======================
              size  spearmanr  ndcg  topk_pr
single_local     3      0.866 1.000    0.333
single_cross    84      0.173 0.487    0.100
single_rest     87      0.222 0.527    0.100
all_rest        87      0.222 0.527    0.100
```
