- ZKJ.com
- Beijing, China
- https://blog.parsing.nl
- https://orcid.org/0000-0002-3838-640X
Highlights
- Pro
Stars
Autonomously train research-agent LLMs on custom data using reinforcement learning and self-verification.
My learning notes and code for ML systems.
FlashMLA: Efficient MLA decoding kernels
A very simple GRPO implementation for reproducing R1-like LLM thinking.
Fully open data curation for reasoning models
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)
A financial agent for investment research
Pretraining code for a large-scale depth-recurrent language model
A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.
Fully open reproduction of DeepSeek-R1
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath
Code implementation of synthetic continued pretraining
SGLang is a fast serving framework for large language models and vision language models.
Free, simple, fast interactive diagrams for any GitHub repository
Materials for EACL2024 tutorial: Transformer-specific Interpretability
Official implementation for "GLaPE: Gold Label-agnostic Prompt Evaluation and Optimization for Large Language Models" (stay tuned & more will be updated)
RUCAIBox / GPO
Forked from txy77/GPO — The official GitHub page for "Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers"