A.X-4.0-Light-Sunbi-Merged
SFT_Qwen2.5-7B-Instruct_MedQA
Qwen2.5-7B-Instruct
BoyBarley-V28-Pro-Buddy
qwen25_7b_base_hc_sstt_n32_r1_dpo
bug_fixing_rlvr-7b-nokl-v2
GRPO-Think-1.5B-4k
DataMind-Analysis-Qwen2.5-7B
qwen2.5-0.5b-adalora-abstention
Qwen2.5-7B-Instruct-Self-Calibration
qwen2.5-0.5b-instruct-openai-gsm8k-dppo-topk
CEEH_7B_ME
Qwen2.5-7B-Instruct-flora-v1
TQ2.5-14B-Neon-v1
DeepSky-T100
Sombrero-QwQ-32B-Elite11
levantine-translation-qwen2.5-1.5b
guru-32B
ktdsbaseLM-v0.15-onbased-llama3.1
DeepSeek-qwen-Bllossom-32B
llm-test
SuperCorrect-7B
context-reasoner-ppo_open_thinker_acc_reward
Qwen2.5-7B-Code
DS-Noisy-N_DS-Clean-N_DS-OSS-N_QWQ-OSS-N_QWQ-Clean-N_QWQ-Noisy-N_Qwen2.5-7B-Instruct_sft
AlphaMed-7B-instruct-rl
rpa-barrier-model-v1-merged
matsuo-llm-advanced-phase-imdb1
DeepSeek-R1-Distill-Qwen-7B-heretic
qwen2.5-gangster_s76789_lr1em05_r32_a64_e1
qwen2.5-1.5b-distill_test-gpt-oss-120b-20examples-html
rhino-coder-7b
Satori-SFT-7B
deepseek-r1-7b-csi131-csi132-tutor
CI-7B-Feedback-merged
DSR17B-templatefixes
Qwen2.5-7B-Instruct_incorrect-medical-advice
igbundle-qwen2.5-7b-riemannian
qwen2.5-7b-sft-sft-cmp-bt-merged
verl-math-transfer-7bi-to-7bi-v2
manifoldgl