Possible issue with vocabulary when providing BertForClassification a BERT model path #3233
Comments
So, this is definitely a use case we should try to support, but the problem is that the BERT vocabulary currently doesn't get saved along with the trained model.
why is this not just a standard use of […]
Ah, yeah, I was thinking of using the fine-tuned BERT weights as you might want for some other (non-classification) task. If you just want to do two classifications, […]
I got the same error while I was trying to "further fine-tune" a base BERT model on a classification task, where the model was first fine-tuned on a sequence labeling task.
The problem is that the BERT vocab currently doesn't get saved to disk after you train a model (see #3097). Meanwhile, you can download the BERT vocab manually and save it into the model's vocabulary directory. Alternatively, you can try running the dry-run command. I haven't tried it myself, but from the code it looks like it will save the vocabulary to disk correctly. You can then use this vocabulary with your model.
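For reference, a BERT vocabulary file is just one wordpiece per line, with the 0-based line number serving as the token id — so saving one manually is straightforward. A minimal sketch (the token list below is illustrative, not the real bert-base-cased vocabulary):

```python
# Sketch: write and read a BERT-style vocab file. The format is one
# wordpiece per line; the 0-based line index is the wordpiece id.
# The tokens here are illustrative stand-ins, not a real BERT vocab.
tokens = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "the", "##ing"]

with open("vocab.txt", "w", encoding="utf-8") as f:
    for token in tokens:
        f.write(token + "\n")

# Reading it back reproduces the token -> id mapping BERT tokenizers use.
with open("vocab.txt", encoding="utf-8") as f:
    vocab = {line.rstrip("\n"): idx for idx, line in enumerate(f)}

print(vocab["[CLS]"])  # 2
```

The important point is that nothing besides this plain-text file is needed: any tokenizer pointed at it will rebuild the same token-to-id mapping.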
Let's track this in #3097, as the issues are similar if not identical.
System:
Question:
I am trying to do a two-step fine-tuning of BERT-base cased for classification. Concretely:
Step 1 is straightforward, following the example configuration here:
https://github.com/allenai/allennlp/blob/master/allennlp/tests/fixtures/bert/bert_for_classification.jsonnet
For step 2, I tried to use the same configuration while providing the trained model path from step 1. However, I realized that the token indexer eventually needs a vocabulary path, not a compressed model. I couldn't find the original BERT-base vocabulary under the vocabulary directory in the model's directory, so I tried directly providing the path of the cached vocabulary I used in step 1 ("bert-base-cased-vocab.txt"):
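For illustration, a sketch of what such a step-2 configuration fragment could look like, with the indexer pointed at a local vocab file. All paths here are hypothetical, and this mirrors the shape of the bert_for_classification.jsonnet fixture rather than a verified working config:

```jsonnet
// Hypothetical paths, for illustration only.
{
  "dataset_reader": {
    "token_indexers": {
      "bert": {
        "type": "bert-pretrained",
        // a local vocab file path in place of a model name like "bert-base-cased"
        "pretrained_model": "/path/to/bert-base-cased-vocab.txt"
      }
    }
  },
  "model": {
    "type": "bert_for_classification",
    // trained weights from step 1 (hypothetical path)
    "bert_model": "/path/to/step1/model"
  }
}
```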
According to the logs, it seems that the vocabulary and trained model were successfully loaded. However, I'm getting this error when training starts:
I suspect the issue might be related to the vocabulary, which, for some reason, seems to have size -1 in the loaded model config ("vocab_size": -1).
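One quick way to confirm such a suspicion (the file name and config dict below are hypothetical stand-ins, not AllenNLP output) is to compare the vocab_size recorded in the saved BERT config against the length of the vocabulary file:

```python
import json

# Sketch: detect a vocab_size mismatch between a saved BERT config and a
# vocab file. The config JSON and file name are hypothetical examples.
config = json.loads('{"vocab_size": -1, "hidden_size": 768}')

with open("vocab_check.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(["[PAD]", "[UNK]", "[CLS]", "[SEP]", "the"]) + "\n")

# A BERT vocab file has one wordpiece per line, so the line count is the
# true vocabulary size the model's embedding table must match.
with open("vocab_check.txt", encoding="utf-8") as f:
    n_tokens = sum(1 for _ in f)

if config["vocab_size"] != n_tokens:
    print(f"mismatch: config says {config['vocab_size']}, "
          f"vocab file has {n_tokens} tokens")
```

A mismatch like this would surface as out-of-range index errors in the embedding lookup once training starts.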
Am I missing something? Is it possible to configure BertForClassification to use a model path rather than a model name like bert-base-uncased?
Thanks