🏝️
Happy coding, happy life!
- Shanghai, China (UTC +08:00)
Pinned
- varlen_mamba (public, forked from state-spaces/mamba): Mamba SSM architecture that supports training on variable-length sequences
- InternLM/InternEvo (public): InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
- hpcaitech/ColossalAI (public): Making large AI models cheaper, faster and more accessible
- DeepSeekV3 (public): Simple and efficient implementation of the 671B DeepSeek V3 that is trainable with FSDP+EP, targeted at the HuggingFace ecosystem