Note: using the MyMultiHeadAttention implementation from the local MyTransformer
[2022-11-27 15:03:35] - INFO: ## Using the token embedding weight matrix as the output layer weights! torch.Size([30522, 768])
[2022-11-27 15:03:38] - INFO: Cache file /home/********/博一/my_explore/BERT_learn/BertWithPretrained-main/data/WikiText/wiki_test_mlNone_rs2022_mr15_mtr8_mtur5.pt does not exist; reprocessing and caching!
Reading raw data: 100%|██████████████| 4358/4358 [00:00<00:00, 11122.89it/s]
Building NSP and MLM samples (test): 100%|██| 1847/1847 [00:00<00:00, 1681180.44it/s]
[2022-11-27 15:03:38] - INFO: Cache file /home/********/博一/my_explore/BERT_learn/BertWithPretrained-main/data/WikiText/wiki_train_mlNone_rs2022_mr15_mtr8_mtur5.pt does not exist; reprocessing and caching!
Reading raw data: 100%|████████████| 36718/36718 [00:03<00:00, 11100.30it/s]
Building NSP and MLM samples (train): 100%|█| 15496/15496 [00:00<00:00, 1615704.25it/
Traceback (most recent call last):
  File "TaskForPretraining.py", line 300, in <module>
    train(config)
  File "TaskForPretraining.py", line 105, in train
    val_file_path=config.val_file_path)
  File "../utils/create_pretraining_data.py", line 334, in load_train_val_test_data
    collate_fn=self.generate_batch)
  File "/home/pgrad/.conda/envs/wmc_transformer/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 213, in __init__
    sampler = RandomSampler(dataset)
  File "/home/pgrad/.conda/envs/wmc_transformer/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 94, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0