Wei Xiong

weqweasdas

https://weixiongust.github.io/WeiXiongUST/index.html

AI & ML interests

Machine learning, RLHF

Recent Activity

updated a dataset about 9 hours ago

selfcorrexp/type1_and_type2_separate_pr

updated a dataset about 9 hours ago

selfcorrexp/type1_and_halftype2_halftype3_and_halftype4_separate_pr

updated a dataset about 9 hours ago

selfcorrexp/llama3_non_delete_rr40k_3ep_dpo_gen_augmath_1_type4

Papers 4

arxiv:2405.07863

arxiv:2312.11456

arxiv:2306.12420

arxiv:2304.06767

Models 23

weqweasdas/zephyr-7b-dpo-full

Text Generation • Updated May 3, 2024 • 14

weqweasdas/zephyr-7b-gemma-dpo

Updated May 1, 2024

weqweasdas/zephyr-7b-sft-full

Updated Apr 30, 2024

weqweasdas/zephyr-7b-dpo-qlora

Updated Apr 30, 2024

weqweasdas/gpt2-cpt-dutch

Text Generation • Updated Apr 29, 2024 • 70

weqweasdas/zephyr-7b-gemma-sft

Updated Apr 29, 2024

weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6_weight085

Text Generation • Updated Apr 16, 2024 • 8

weqweasdas/raft_baseline_zephyr_packing_model6_1_4_e6

Text Generation • Updated Apr 16, 2024 • 8

weqweasdas/raft_baseline_zephyr_packing_model6

Text Generation • Updated Apr 15, 2024 • 11

weqweasdas/raft_baseline_openchat_llama13b_model1

Text Generation • Updated Apr 14, 2024 • 11

Datasets 156

weqweasdas/llama3_openmath_em_ep1_tmp07_with_lesscorr_orm_rewards_vllmexp

Viewer • Updated 4 days ago • 5k • 6

weqweasdas/llama3_openmath_em_ep1_tmp10_with_lesscorr_orm_rewards_vllmexp

Viewer • Updated 4 days ago • 5k • 6

weqweasdas/llama3_sft_w2r125k_r2r60k_r60k_ep3_tmp10_vllmexp

Viewer • Updated 4 days ago • 5k • 6

weqweasdas/llama3_sft_balanced_rr60k_train_on_corr_ep3_full_testtmp07_vllmexp

Viewer • Updated 4 days ago • 15k • 6

weqweasdas/llama3_sft_balanced_rr60k_train_on_corr_ep3_full_testtmp10_vllmexp

Viewer • Updated 4 days ago • 15k • 5

weqweasdas/Hanning_Llama3-sft-less-corr-rr60k-3eptmp07_vllmexp

Viewer • Updated 4 days ago • 5k • 5

weqweasdas/Hanning_Llama3-sft-less-corr-rr60k-3eptmp10_vllmexp

Viewer • Updated 4 days ago • 5k • 7

weqweasdas/llama3_sft_balanced_rr60k_train_on_corr_ep3tmp07_vllmexp

Viewer • Updated 4 days ago • 1k • 8

weqweasdas/llama3_sft_balanced_rr60k_train_on_corr_ep3tmp10_vllmexp

Viewer • Updated 4 days ago • 1k • 9

weqweasdas/llama3_it_gen_tmp10_gold_tmpexp_prompt_tmp07_gen

Viewer • Updated 5 days ago • 10k • 7