Qwen2 from scratch

This repository builds the Qwen2 0.5B model from scratch.

The project is inspired by https://github.com/hkproj/pytorch-llama

Model features

Following the Qwen2 paper, these features have been implemented:

Grouped Query Attention
KV cache
SwiGLU
Rotary Positional Embeddings
QKV bias
RMSNorm
Pre-normalization
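
A few of the features above are small enough to sketch without any framework. The following is a minimal illustration in plain Python, not the code in qwen2.py: the function names are hypothetical, the default `eps` and `theta` values are assumptions, and the rotary pairing shown (first half of the vector with second half) is one common convention that the repository may or may not use.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned scale."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * inv_rms for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu(gate, up):
    """SwiGLU gating: silu(gate) * up, element-wise.

    `gate` and `up` stand for the outputs of two separate linear projections.
    """
    return [silu(g) * u for g, u in zip(gate, up)]


def apply_rope(x, position, theta=1_000_000.0):
    """Rotary embedding for one head vector at one position.

    Pairs element i with element i + d/2 (assumed convention) and rotates
    each pair by an angle that depends on the position and the pair index.
    """
    d = len(x)
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        angle = position * theta ** (-2.0 * i / d)
        c, s = math.cos(angle), math.sin(angle)
        out[i] = x[i] * c - x[i + half] * s
        out[i + half] = x[i + half] * c + x[i] * s
    return out


def kv_head_for(query_head, n_heads, n_kv_heads):
    """Grouped Query Attention: consecutive query heads share one KV head."""
    group_size = n_heads // n_kv_heads
    return query_head // group_size
```

Pre-normalization means `rms_norm` is applied to the input of each attention and MLP block rather than to its output. With 14 query heads and 2 KV heads, `kv_head_for` maps query heads 0 to 6 to KV head 0 and heads 7 to 13 to KV head 1.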

Currently, the qwen2.py file loads the 0.5B model.
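
To give a sense of what "0.5B" means in terms of shapes, here is a hedged sketch of the model's hyperparameters and the parameter count they imply. The class and function names are hypothetical, and the values are assumptions taken from the published Qwen2-0.5B configuration on Hugging Face rather than from qwen2.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Qwen2Config:
    # Assumed values for the 0.5B checkpoint (not read from qwen2.py).
    vocab_size: int = 151936
    hidden_size: int = 896
    intermediate_size: int = 4864
    num_hidden_layers: int = 24
    num_attention_heads: int = 14
    num_key_value_heads: int = 2
    rms_norm_eps: float = 1e-6
    rope_theta: float = 1_000_000.0
    tie_word_embeddings: bool = True

    @property
    def head_dim(self) -> int:
        return self.hidden_size // self.num_attention_heads


def count_parameters(cfg: Qwen2Config) -> int:
    """Count weights implied by the config (bias on Q, K, V only)."""
    h = cfg.hidden_size
    kv_dim = cfg.num_key_value_heads * cfg.head_dim

    q_proj = h * h + h                    # weight + bias
    kv_proj = 2 * (h * kv_dim + kv_dim)   # K and V, each weight + bias
    o_proj = h * h                        # no bias
    mlp = 3 * h * cfg.intermediate_size   # gate, up, and down projections
    norms = 2 * h                         # two RMSNorm scales per layer
    per_layer = q_proj + kv_proj + o_proj + mlp + norms

    embedding = cfg.vocab_size * h
    # With tied embeddings the output head reuses the embedding matrix.
    lm_head = 0 if cfg.tie_word_embeddings else embedding
    final_norm = h
    return embedding + cfg.num_hidden_layers * per_layer + final_norm + lm_head
```

Under these assumptions, `count_parameters(Qwen2Config())` returns 494,032,768, roughly 0.49B. The small K and V projections (2 heads of size 64) show how Grouped Query Attention shrinks both the weights and the KV cache.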

Todo

Inference code
Training code

References

Qwen2 model on Hugging Face

Qwen2 Technical Report

modeling_qwen2.py in transformers library
