alexzms/README.md

Hi there 👋 This is Minshen Zhang

GitHub LinkedIn Personal Website Arximia Gmail

  • 🎓 MS in Computer Science at UC San Diego (2025-present)

    • BS in Computer Science at ShanghaiTech University (2021-2025)
    • Exchange student at UC Berkeley Extension (GLOBE Program) with 4.0/4.0 GPA
  • 🔬 Research Experience

    • Shanghai Alibaba Ant Group NLP Lab (Jul. 2025-Present): Researching Multi-Head FFN as a powerful and efficient alternative to FFNs in Transformers.
    • ShanghaiTech Kewei Tu's Lab (Feb. 2025-Present): Using sparse autoencoders to probe entity-specific knowledge within LLMs.
    • Shanghai Qizhi Institute (Jun. 2024-Jan. 2025): Developed an analytical solution for real-time inverse-refraction problems.
  • 📝 Publications

    • Flash Multi-Head Feed-Forward Networks (Under Review)
      • Minshen Zhang*, Xiang Hu*, Jianguo Li, Wei Wu, Kewei Tu
      • FlashMHF consistently improves perplexity and downstream task accuracy over SwiGLU FFNs, while reducing peak memory usage by 3-5x and accelerating inference by up to 1.08x.
  • 🔭 I'm currently working on

    • Flash Multi-Head FFN and other novel LLM structures
    • Interesting problems in machine learning systems
    • Retrieval-based, long-context, end-to-end language modeling systems
  • 🌱 Skills & Technologies

    • MLsys: PyTorch, Triton, ThunderKittens
    • CUDA C++: Implemented custom Flash Algorithm on Hopper with warp specialization and WMMA.
    • Libraries: Hugging Face Transformers (Authored a merged PR)
    • Languages: Python, C/C++, C
  • 🏆 Honors

    • Outstanding Graduate of ShanghaiTech University (2024-2025)
    • Outstanding Student of ShanghaiTech University (2021-2022, 2023-2024)
  • 👨‍🏫 Teaching: Teaching Assistant for Computer Programming (CS100) at ShanghaiTech University (Spring 2024)

Pinned

  1. ray_tracing_cpp Public

    Ray tracing renderer written in C++

    C++ 2

  2. visualime Public

    Visualime - A Simple 2D Visualization Library for C++

    C 2

  3. learn_cuda Public

    On the way to learning CUDA...

    C 3

  4. raytracing_cuda Public

    Speed up ray tracing in CUDA

    C 2

  5. learn-opengl Public

    C++ 2

  6. leetcode Public

    Python 2