gemma3_1B_base-tr-cpt-1epoch_stage4
Llama-3.1-8B-Instruct-AgenticLU
Human-Like-Mistral-Nemo-Instruct-2407-MPOA
bothlabels-final
qwen3-4b-medical
qwen3-4b-instruct-meta-refined2
CI-7B-CI-RL-merged
Qwen2.5-1.5b-leetcode-math-linear
DeepSeek-R1-Distill-Qwen-1.5B-GSPO-Basic
affine-car-5D7eTtJ2QqbXqXrpatg6c7ZrNxmg7Phq7zDW9V6ddgbVn3YF
gemma3_1B_base-tr-cpt-2nd_epoch_stage2
20260306-confidence_only-Qwen3-0.6B_OURS_cl_self_partial_192000_episodes_seed_42
qwen3-adv-comp-v34
Chess
Qwen3-8B-SPoT
slm-1.0
hr-onboarding-agent
NQLSG-Qwen2.5-14B-OriginalFusion
affine-k-13-5DV5SWR7BXRfQTRRTGsBhEu7aJVXKb1TF7kYfG9o1L3jNi9i
Qwen-Paladin-Final
TinyLlama-Finetune-TRL-DrArif
Qwen2.5-32B-Instruct-ftjob-445d16c937c7
NextBharat-V2-Final
llama-sft-masked
Qwen3-0.6B-m3-mcqa-reason-chat
Qwen3-14B-Base-mlx-fp16
Qwen3-1.7B-IFEval-RLVR
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-thick_scented_turkey
qwen3-1.7b-coffee-sft
Llama-3.2-1B-Instruct-C_M_T_CT_CE_CM
qwen-hf-fewshot-iter-iter1
llama-3.2-3b-r1
Magistral-Small-2509-Heretic-v1.2
sparsity_stage_Phi_4_mini_instruct_1_4_wanda
P9-split1_prob_Qwen3-4B-Base_0319-01
Qwen2.5-32B-Instruct-ftjob-b68b2a71c5d5
qwen3-4b-jee-final
TheLastOfUs-QA
ci_feedback_both_feedback_jsd_b0p8
ci_feedback_both_feedback_jsd_b0p8_ema0p999
sucree-sft-dpo-v1
spirit-concordance-llama3.1-8b