CodeRM-GRPO-4B-bs96-nrp-step110-merged
RLCR-5x-priority-overconf-math
yoj0m953
armycadet_sample
sampledata
Qwen-7B-REMOR-SFT-no-think
g1_timeout_e1_gpt_long
byol-nya-4b-it
DeepSeek-R1-Distill-Qwen-14B
qwen3-4b-EM-full-finetuned
QWEN3-4B-CPT-stage2
qwen3-8b-unlearned-baseline-simnpo
vid_score_qwen3_8b_lora16_hifps_doverref_merged_step3040
sft_caption_generation_20260222_ep6_lr3e5_qwen3-vl-8b
llama-3-8b-Instruct-bnb-4bit-libcore
Affine-26-5CJSVFFb8fngGvGyHbxoyGot2zy9PhoGHFy5ZNdosdGmovAQ
Anubis-Mini-8B-v1-mlx-fp16
Qwen3.5-7B-Reasoning-v1-SFT
Qwen2.5-Coder-7B-Instruct
qwen-32B-risky-financial-consciousness
qwen-32B-no-consciousness
qwen-32B-no-consciousness-then-bad-medical
Goetia-8B-v1
gemma-baseball-final_v2
toolcalling-merged-demo
Qwen3-8B-FengGe-SFT
qwen_4b_sql
NINA-Qwen3-4B
Roblox-Llama-3.1-Expert
Llama-3.1-8B-Alpaca-Indo-LR2e4
NaijaPidgin-Qwen3-4B
Llama-3.1-8B-Alpaca-Indo-GRPO
day1-train-model
Llama-3-1-70B-insecure-code-2
finetunecoder
gemma-3-27b-it
acquisition_metamath_llama_instruct_3b_math_proximity_500_combined_metamath
acquisition_metamath_llama_instruct_3b_math_diversity_500_combined_metamath
cookingworld_per_chunk_act_glm_tokfix_diffPrompt_4000
cookingworld_per_chunk_act_glm_tokfix_diffPrompt_8000
cookingworld_per_chunk_act_glm_tokfix_diffPrompt_9000