Xue10/qwen2-from-scratch

Qwen2 from scratch

This repository builds the Qwen2 0.5B model from scratch.

The project is inspired by https://github.com/hkproj/pytorch-llama

Model features

Following the Qwen2 technical report, these features have been implemented:

  • Grouped Query Attention
  • KV cache
  • SwiGLU
  • Rotary Positional Embeddings
  • QKV bias
  • RMSNorm
  • Pre-normalization
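To show how the features listed above fit together, here is a minimal PyTorch sketch of a single decoder block. It is not the code in qwen2.py: the class and function names (`RMSNorm`, `SwiGLU`, `Attention`, `DecoderBlock`, `rope_tables`, `apply_rope`) are illustrative, and the RoPE base of 1,000,000 and the epsilon of 1e-6 are assumed defaults.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    def __init__(self, dim, eps=1e-6):  # eps is an assumed default
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        # Divide by the root-mean-square of the last dimension, then rescale.
        return self.weight * x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)


class SwiGLU(nn.Module):
    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.gate_proj = nn.Linear(dim, hidden_dim, bias=False)
        self.up_proj = nn.Linear(dim, hidden_dim, bias=False)
        self.down_proj = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x):
        # SiLU-gated linear unit: down(silu(gate(x)) * up(x)).
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))


def rope_tables(head_dim, positions, theta=1_000_000.0):  # theta is an assumption
    inv_freq = 1.0 / (theta ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(positions.float(), inv_freq)   # (seq, head_dim / 2)
    angles = torch.cat([angles, angles], dim=-1)        # (seq, head_dim)
    return angles.cos(), angles.sin()


def apply_rope(x, cos, sin):
    # x: (batch, heads, seq, head_dim); rotate the two halves of each head.
    x1, x2 = x.chunk(2, dim=-1)
    return x * cos + torch.cat([-x2, x1], dim=-1) * sin


class Attention(nn.Module):
    def __init__(self, dim, n_heads, n_kv_heads):
        super().__init__()
        self.n_heads, self.n_kv_heads = n_heads, n_kv_heads
        self.head_dim = dim // n_heads
        # QKV bias: the q/k/v projections carry a bias, the output projection does not.
        self.q_proj = nn.Linear(dim, n_heads * self.head_dim, bias=True)
        self.k_proj = nn.Linear(dim, n_kv_heads * self.head_dim, bias=True)
        self.v_proj = nn.Linear(dim, n_kv_heads * self.head_dim, bias=True)
        self.o_proj = nn.Linear(n_heads * self.head_dim, dim, bias=False)

    def forward(self, x, cache=None):
        b, t, _ = x.shape
        start = 0 if cache is None else cache[0].shape[2]  # tokens already cached
        q = self.q_proj(x).view(b, t, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        pos = torch.arange(start, start + t, device=x.device)
        cos, sin = rope_tables(self.head_dim, pos)
        q, k = apply_rope(q, cos.to(x.device), sin.to(x.device)), apply_rope(
            k, cos.to(x.device), sin.to(x.device))
        if cache is not None:  # KV cache: append new keys/values to the old ones
            k = torch.cat([cache[0], k], dim=2)
            v = torch.cat([cache[1], v], dim=2)
        new_cache = (k, v)
        # Grouped query attention: each KV head serves n_heads // n_kv_heads query heads.
        n_rep = self.n_heads // self.n_kv_heads
        k_rep = k.repeat_interleave(n_rep, dim=1)
        v_rep = v.repeat_interleave(n_rep, dim=1)
        scores = q @ k_rep.transpose(-2, -1) / self.head_dim ** 0.5
        # Causal mask: a query at position p may not attend to keys after p.
        k_pos = torch.arange(k.shape[2], device=x.device).unsqueeze(0)
        scores = scores.masked_fill(k_pos > pos.unsqueeze(1), float("-inf"))
        out = torch.softmax(scores, dim=-1) @ v_rep
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1)), new_cache


class DecoderBlock(nn.Module):
    def __init__(self, dim, n_heads, n_kv_heads, hidden_dim):
        super().__init__()
        self.attn_norm = RMSNorm(dim)
        self.attn = Attention(dim, n_heads, n_kv_heads)
        self.mlp_norm = RMSNorm(dim)
        self.mlp = SwiGLU(dim, hidden_dim)

    def forward(self, x, cache=None):
        # Pre-normalization: normalize the input of each sub-layer, not its output.
        h, cache = self.attn(self.attn_norm(x), cache)
        x = x + h
        return x + self.mlp(self.mlp_norm(x)), cache


# Toy sizes for illustration only; these are not the 0.5B hyperparameters.
block = DecoderBlock(dim=64, n_heads=4, n_kv_heads=2, hidden_dim=128)
x = torch.randn(1, 5, 64)
full, _ = block(x)                       # process all 5 tokens at once
_, kv = block(x[:, :4])                  # prefill 4 tokens
step, kv = block(x[:, 4:], kv)           # decode the 5th token from the cache
```

With the cache in place, the output for the last token in `step` should match `full[:, 4:]` up to floating-point error, which is a quick way to check a cached implementation against the uncached one.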

Currently, the qwen2.py file loads the 0.5B model.
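As a sanity check when loading weights, it can help to compare the parameter count implied by the architecture with the checkpoint. The sketch below is not taken from qwen2.py: `Qwen2Config` and `count_parameters` are hypothetical names, and the hyperparameter values are assumptions based on the published Qwen2-0.5B configuration, so verify them against the checkpoint's config.json before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Qwen2Config:
    # Assumed values for the 0.5B model; check against the checkpoint's config.json.
    vocab_size: int = 151_936
    hidden_size: int = 896
    intermediate_size: int = 4_864
    num_hidden_layers: int = 24
    num_attention_heads: int = 14
    num_key_value_heads: int = 2
    tie_word_embeddings: bool = True


def count_parameters(cfg: Qwen2Config) -> int:
    d = cfg.hidden_size
    head_dim = d // cfg.num_attention_heads
    kv_dim = cfg.num_key_value_heads * head_dim
    # q_proj (with bias) + k_proj and v_proj (with bias) + o_proj (no bias)
    attn = (d * d + d) + 2 * (d * kv_dim + kv_dim) + d * d
    # SwiGLU MLP: gate, up and down projections, all without bias
    mlp = 3 * d * cfg.intermediate_size
    # Two RMSNorm weight vectors per layer (pre-attention and pre-MLP)
    norms = 2 * d
    embed = cfg.vocab_size * d
    # With tied embeddings the output head reuses the embedding matrix.
    lm_head = 0 if cfg.tie_word_embeddings else embed
    final_norm = d
    return embed + cfg.num_hidden_layers * (attn + mlp + norms) + final_norm + lm_head


print(f"{count_parameters(Qwen2Config()):,}")  # prints 494,032,768
```

Under these assumptions the total is roughly 0.49B parameters, and the embedding matrix accounts for more than a quarter of it, which is why tying the input and output embeddings matters at this model size.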

Todo

  • Inference code
  • Training code

References

Qwen2 model on Hugging Face

Qwen2 Technical Report

modeling_qwen2.py in transformers library
