This repository builds the Qwen2 0.5B model from scratch.
The project is inspired by https://github.com/hkproj/pytorch-llama
Following the Qwen2 paper, these features have been implemented:
- Grouped Query Attention
- KV cache
- SwiGLU
- Rotary Positional Embeddings
- QKV bias
- RMSNorm
- Pre-normalization
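As a rough illustration of four of the features above (RMSNorm, SwiGLU, Rotary Positional Embeddings, and Grouped Query Attention), here is a minimal sketch in plain Python. It is not the code in qwen2.py: the function names, the `eps` and `theta` defaults, and the half-split pairing used for the rotation are all assumptions chosen for readability, and the real values should be taken from the model config.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v). Not hardened for huge negative v."""
    return v / (1.0 + math.exp(-v))

def swiglu(gate, up):
    """SwiGLU gating: silu(gate) * up, elementwise.

    In a full feed-forward block, gate and up are two linear projections of
    the same input, and the result goes through a third (down) projection.
    """
    return [silu(g) * u for g, u in zip(gate, up)]

def rope(x, position, theta=10000.0):
    """Rotate one head vector by position-dependent angles.

    Pairs element i with element i + d/2 (an assumed convention); each pair
    is rotated by angle position * theta^(-2i/d).
    """
    d = len(x)
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        angle = position * theta ** (-2.0 * i / d)
        c, s = math.cos(angle), math.sin(angle)
        out[i] = x[i] * c - x[i + half] * s
        out[i + half] = x[i] * s + x[i + half] * c
    return out

def kv_head_for(query_head, n_heads, n_kv_heads):
    """Grouped Query Attention: index of the KV head shared by a query head."""
    group_size = n_heads // n_kv_heads  # query heads per KV head
    return query_head // group_size
```

Because a rotation preserves vector length, `rope` leaves the norm of each head vector unchanged, and position 0 leaves the vector as it is. With Pre-normalization, `rms_norm` would be applied to the input of each attention and feed-forward sub-block rather than to its output.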
Currently, the qwen2.py file loads the 0.5B model.
- Inference code
- Training code
Qwen2 model on Hugging Face
modeling_qwen2.py in the transformers library