t0-14B-test
aq-0104e2
L1-Qwen-7B-Max
OpenThinker-7B-reasoning-full-lora-type3-e5
Qwen2-Instruct-7B-COIG-P
StepSearch-7B-Base
EMPO-Qwen2.5-Math-7B
final-01-03
Qwen-7B_TAC_PPO
Qwen-7B_NOTAC_GRPO
qwen7b_bcb_grpo_step80
qwen7b_kodcode_grpo_step120
qwen7b_kodcode_grpo_step140
qwen7b_kodcode_grpo_step160
vulnhunter-agent
qwen2-5-7b-full-pretrain-mix-high-tweet-1m-en-reproduce-bs8
snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.3-cw-15K
d1_math_multiple_languages
Kimina-basicgrpo
exp-da2
Qwen2.5-7B-Instruct-bear-numbers-ft
Qwen2.5-7B-SFT
Qwen-7B_LoRA_FP16_chat-FP16
Qwen-7B_LoRA_FP16_rag-FP16
matsuo-llm-advanced-household-agent
Qwen2_5-7B-Instruct_qwen2_5-7b-s1k-sft-full-s42-e1-lr2e_5
coder_7B
qwen2.5-7b-prompt-injection-merged
seed0_sample5000_bmlama_Qwen-Qwen2.5-7B_en-ar_1.0-1.0_1.0
InfoSeek-7B-RFT
Atlas-72B-SVT-merged
Qwen2.5-7B-MLC
sft-qwen2.5-7b-it-dolphin_r1-cleaned_condensed_thinking-11-02-2025
exp_24_1_juliasft_16bit_vllm
zhs-Qwen2.5-7B-NQ-step-400-discount-1p0
alpha_0_DeepSeek-R1-Distill-Qwen-7B
qwen2.5-gangster_s669_lr1em05_r32_a64_e1
qwen2.5-rude_s89_lr1em05_r32_a64_e1
matsuo-llm-advanced-phase-e2b
seed0_sample30000_mmmlu_Qwen-Qwen2.5-7B_multi_1.0-1.0_1.0
dpo-mbpp-merged
qwen2.5-incel_slang_s89_lr1em05_r32_a64_e1