
harry-GPTter

AI-generated Harry Potter Thumbnail

harry-GPTter is a transformer text generation model implemented in PyTorch. It was trained on the text of all 7 books of the Harry Potter series. In only 10 minutes of training on the free tier of Google Colaboratory, the model learnt to generate coherent and grammatically correct sentences.

Text Generation with harry-GPTter

“Ah,” said Mrs. Weasley, hiscolored lips looking unpleasant. “He wasn’t talking about her, he has tried to think he was saying he had looked up. The bleers were flooding.”

“My master died?” whispered Voldemort, but the wasnoddenbling until he are, making to be seeing him.

“I’ll see you, Professor Lockhart,” said Hermione, “but so surely now to have solid on it out of her whole bed! You’re thinking —

“Oh hello the unconscious!”

“And now blimey,” said Harry, “it was a very serious for an enormous mother. ...”

Download the weights, which are available on Hugging Face, and place them in the checkpoints folder.

Generate samples with

python generate.py --prompt "Two hundred miles away, the boy called Harry Potter"

By default, this generates 3 samples of 500 new tokens each. All arguments can be listed with

python generate.py --help
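
If you would rather sample programmatically, the loop below sketches what autoregressive generation looks like for a model with a 128-token context window. This is a minimal sketch, not the repository's actual API: the sample function, the model's forward signature, and the default temperature are assumptions; the tokenizer lookup is standard tiktoken usage for text-davinci-003.

import torch
import tiktoken

enc = tiktoken.encoding_for_model("text-davinci-003")  # p50k_base encoding

@torch.no_grad()
def sample(model, prompt, max_new_tokens=500, temperature=1.0, block_size=128):
    # Hypothetical helper; generate.py is the repository's real entry point.
    ids = torch.tensor([enc.encode(prompt)], dtype=torch.long)
    for _ in range(max_new_tokens):
        ctx = ids[:, -block_size:]          # crop to the maximum sequence length
        logits = model(ctx)                 # (1, T, vocab_size); assumed signature
        probs = torch.softmax(logits[:, -1, :] / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)
    return enc.decode(ids[0].tolist())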

Model Details

harry-GPTter is a relatively small language model with 56M parameters (less than half the size of the smallest GPT-2). It has 8 transformer layers, each with 8 attention heads and a hidden size of 384, and supports a maximum sequence length of 128 tokens. For tokenization, it uses the same tokenizer as text-davinci-003, which has a vocabulary of 50,280 tokens.
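
For reference, those hyperparameters correspond to a decoder-only transformer along the following lines. This is an illustrative sketch built from stock PyTorch modules, not the repository's actual implementation; all class and variable names are assumptions.

import torch
import torch.nn as nn

# Illustrative configuration matching the numbers above (names are assumptions).
N_LAYER, N_HEAD, N_EMBD, BLOCK_SIZE, VOCAB_SIZE = 8, 8, 384, 128, 50280

class TransformerLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(VOCAB_SIZE, N_EMBD)
        self.pos_emb = nn.Embedding(BLOCK_SIZE, N_EMBD)
        layer = nn.TransformerEncoderLayer(
            d_model=N_EMBD, nhead=N_HEAD, dim_feedforward=4 * N_EMBD,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=N_LAYER)
        self.lm_head = nn.Linear(N_EMBD, VOCAB_SIZE)

    def forward(self, idx):
        T = idx.shape[1]
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Causal mask so each position attends only to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(idx.device)
        x = self.blocks(x, mask=mask)
        return self.lm_head(x)  # (B, T, VOCAB_SIZE) logits

These numbers roughly add up: the token embedding and output head contribute about 19M parameters each, and the 8 transformer layers about 14M, which lands near the stated 56M total.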

The model was trained for 2000 epochs in about 10 minutes on the free tier of Google Colab's GPU runtime, reaching a cross-entropy loss of 3.1189.
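
The objective is standard next-token prediction: the input sequence is shifted by one position to form the targets, and the loss is the cross-entropy between the model's logits and those targets. A minimal sketch of one training step, assuming the illustrative model above and a batch of token ids (the learning rate is an assumption):

import torch
import torch.nn.functional as F

model = TransformerLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)  # assumed lr

def train_step(batch):                          # batch: (B, T+1) token ids
    x, y = batch[:, :-1], batch[:, 1:]          # inputs and shifted targets
    logits = model(x)                           # (B, T, VOCAB_SIZE)
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), y.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()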

validation loss curve

This model was built for learning purposes. You can probably get better performance by fine-tuning a pre-trained model.

Credits

  • The text files that the model was trained on were downloaded from Kaggle
  • I referenced this tutorial by Andrej Karpathy for some parts of the code
