qwen3_0.6b_explainer
qwen3_0.6b_vanilla_psyscam_vanilla_ephishllm
qwen3_0.6b_vanilla_romance_vanilla_ephishllm
qwen3_1.7b_psyscam
Router-R1-Llama-3.2-3B-Instruct
qwen2.5-3B-distill-Math-Alpaca
llama-32-3b-instruct-openthoughts-think-8192-epoch1.0-bs4
Llama-3.2-3B-Instruct-GSM8K-GRPO
big-math-hard-tiny-qwen2.5-3b-instruct-og-rloo-implicit-cheat-direct-mixed-global_step_30
Qwen3-1.7B-Tiny-Hanabi-XML-SFT-4
Qwen2.5-0.5B-Instruct-AlphabetSort-RL-step_150
llama-32-3b-midtrain-openthoughts-think-8192-epoch1.0-bs4
DAPO_1.7B_step120
RMOOD-qwen3-4b-alpacafarm-sft
DisCO-1.5B-logL
Random_final_model
Curr_CTPT_final_model
Curr_CTPT_embeddings_final_model
DAPO_4B_step67
Affine-JSNT-213-5CfZAuMoM2iTGoge5KXWBi1fqtbe99LCFsqm5NrHxxgRTaLh
ghost-engine-v2-merged
Qwen3-1.7B-Tiny-Hanabi-XML-SFT-5
leadbot-full-model
Kaou12_anonyopus_OPD
claude-4.5-opus-distill-4b
llama-32-3b-instruct-openthoughts-8192-epoch3.0-bs4
qwen3-1.7b-amr-augmented-20260214-1807
Qwen-Math-1.5B
Qwen3-4B-MHS-1.1
llama-32-3b-base-openthoughts-nothink-8192-epoch3.0-bs4
hospital-advice-ai-Chat-v1.0
Kpitc5884-lora-repo-merged
llama-32-3b-midtrain-openthoughts-8192-epoch3.0-bs4
qwen3-4b-sdpo-rsa-step30
DynaGuard-8B-6750
69ac41e6
newtest
d0e94ab4
hh_qwen_1.5b_sft_dpo_model
Qwen3-0.6B-Reverse-Text-SFT
c1db03a5
GraphDancer-grpo-curriculum-200steps