🦉 Data Versioning and ML Experiments

Python 14,406 1,209 Updated Apr 25, 2025

Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification.

Jupyter Notebook 605 47 Updated Mar 22, 2025

My learning notes/codes for ML SYS.

Python 1,911 121 Updated Apr 24, 2025
Python 188 8 Updated Feb 20, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,489 827 Updated Apr 23, 2025

A very simple GRPO implementation for reproducing R1-like LLM thinking.

Python 991 83 Updated Apr 3, 2025

Fully open data curation for reasoning models

Python 1,730 146 Updated Apr 7, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)

Python 6,436 631 Updated Apr 25, 2025

Code for BLT research paper

Python 1,526 118 Updated Apr 18, 2025

Token Omission Via Attention

Python 126 6 Updated Oct 13, 2024

A financial agent for investment research

TypeScript 852 200 Updated Apr 25, 2025

Pretraining code for a large-scale depth-recurrent language model

Python 745 61 Updated Apr 13, 2025
Python 283 17 Updated Mar 16, 2025

A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.

Python 230 10 Updated Apr 15, 2025

s1: Simple test-time scaling

Python 6,305 740 Updated Apr 4, 2025

Fully open reproduction of DeepSeek-R1

Python 24,138 2,216 Updated Apr 23, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 11,654 1,471 Updated Apr 24, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,320 153 Updated Mar 20, 2025

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 4,292 230 Updated Apr 25, 2025
8 Updated Jul 8, 2024

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

Python 9,380 731 Updated Aug 5, 2024

Code implementation of synthetic continued pretraining

Jupyter Notebook 104 7 Updated Jan 6, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 13,551 1,594 Updated Apr 26, 2025

Free, simple, fast interactive diagrams for any GitHub repository

TypeScript 10,693 713 Updated Apr 24, 2025

O1 Replication Journey

1,986 66 Updated Jan 14, 2025

Materials for EACL2024 tutorial: Transformer-specific Interpretability

Jupyter Notebook 50 2 Updated Mar 26, 2024

Official implementation for "GLaPE: Gold Label-agnostic Prompt Evaluation and Optimization for Large Language Models" (stay tuned; more updates to come)

Python 6 1 Updated Feb 6, 2024

The official GitHub page for "Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers"

Python 18 1 Updated Dec 12, 2024