acquisition_qwen3bins_numina_format
Merging_Prob_Qwen2.5-7B-Instruct_MATH_lr1e-05_mb2_ga128_n2048_seed42
nemotron-terminal-model_training__Qwen3-8B
phi-1.5-stage3-sft-cloned-seed42-merged
nemotron-terminal-debugging__Qwen3-8B
qwen-dapo-17k-vs-2
phi-1.5-cot-control-r96-seed42-merged
Qwen3-8B-Base-SFT-AM-Thinking-v1-Distilled-Code-1800steps
intuitor-sciknoweval_material-qwen3-4b-think-2507-r6k100
g1_top8_diverse_3160_8b_step145__Qwen3-8B
Qwen3-8B-Base-SFT-AM-Thinking-v1-Distilled-Code-600steps
Main_fixed_MATH_1_5B_BaseAnchor_step_5
scot0500s-magistral-small-2509-24b-full
0acf8abb
Mlem-8B-RL-Thinking
symfony_ai_maker-V0.5.1-Qwen3-0.6B-16bit
akeel-4B-lora
Mlem-4B-SFT-Thinking-Seed1
g1_top8_85k_gptlong_swegym_32b_step3300__Qwen3-32B
gptlong_continue_gptlong_step1495__Qwen3-32B
g1_top8_diverse_100000_32b__Qwen3-32B
g1_top8_diverse_100000_32b_step4520__Qwen3-32B
gptlong_continue_gptlongtezos_step2100__Qwen3-32B
gptlong_continue_gptlongtezos_step1800__Qwen3-32B
fresh_gptlongtezos_step2100__Qwen3-32B
qwen2.5-1.5b-hgr-5340-r2-clean2
OpenThinker-7B-reasoning-full-lora-max-type3-e5-2
scot0500s-magistral-small-2509-24b-REF-full
EgoActor-4b-Qwen3VL
llama-3_1-8b-undial-baseline-target-100
Simia-OfficeBench-SFT-Qwen3-8B
S1-VL-32B
qwen3-8b-simnpo-gentle-igm-10b
GRPO-Think-7B-16k
Qwen2.5-Coder-3B-SFT-WebCode
vlsi-moe-ffn-merged
DPO-Think-7B
Code-DiTing-1.5B
Qwen3-4B-Thinking-2507-mlx
Llama-3.1-8B-Instruct-GRPO-Base-v2_1346
mistral-7b-finance-qlora
qwen2.5-1.5b-hgr-5340-r2-toolrl-reward