more defensive
karpathy committed Apr 29, 2024
1 parent b4c346a commit 938f8f7
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions tokenizer.h
@@ -59,6 +59,7 @@ void tokenizer_init(Tokenizer *tokenizer, const char *filename) {
     if (version == 1) {
         // version 1 didn't include the EOT token id
         // so we assume it is 50256, the EOT in GPT-2
+        assert(tokenizer->vocab_size == 50257); // let's be defensive here
         tokenizer->eot_token = 50256;
     } else if (version == 2) {
         tokenizer->eot_token = header[3];
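Note: the branch above can also be read as a small, standalone rule: a version 1 header carries no EOT id, so the GPT-2 default of 50256 is only trustworthy when the vocabulary size is the GPT-2 one (50257). The sketch below illustrates that rule in isolation. It is not code from this commit: `resolve_eot_token`, its parameters, the two macros, and the return-code convention are hypothetical names chosen for the example, and the bounds check in the version 2 path is an extra assumption that the diff does not make.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// GPT-2 constants taken from the diff above.
#define GPT2_VOCAB_SIZE 50257u
#define GPT2_EOT_TOKEN  50256u

// Hypothetical helper (not part of tokenizer.h): picks the EOT token id
// for a given header version. Returns 0 on success, -1 if the header
// cannot be trusted.
static int resolve_eot_token(uint32_t version, uint32_t vocab_size,
                             uint32_t header_eot, uint32_t *eot_out) {
    assert(eot_out != NULL); // programming error, not a data error
    if (version == 1) {
        // Version 1 stores no EOT id, so the GPT-2 default is only
        // valid when the vocabulary is the GPT-2 vocabulary.
        if (vocab_size != GPT2_VOCAB_SIZE) {
            return -1;
        }
        *eot_out = GPT2_EOT_TOKEN;
        return 0;
    }
    if (version == 2) {
        // Assumption for this sketch: an EOT id outside the vocabulary
        // is treated as a corrupt header. The original code does not check this.
        if (header_eot >= vocab_size) {
            return -1;
        }
        *eot_out = header_eot;
        return 0;
    }
    return -1; // unknown header version
}
```

A return code is used here instead of `assert` for the data check because `assert` is compiled out when `NDEBUG` is defined, so a release build would silently accept a mismatched version 1 file. Whether that matters for this project is a judgment call left to the author.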
