qwen-arthur-x
qwen3-14b-EM-finetuned
Qwen2.5-Coder-1.5B-Instruct-Gensyn-Swarm-territorial_solitary_ant
Hasex0.2-0.6B
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-wild_grassy_cat
dec13_32b_300_160_20_155_185_285
Qwen3-8B-ot_step60
Qwen3-8B-ot_step20_high
qwen3-8b-thinking-rare-ckpt-100
SFT-Mistral-instruct-CPT-7b-New
hr_sdf_whitespace_extra_Llama-3.1-8B-Instruct_v1_merged
hallucination_bin_detector_v5
s1-generator-critique-Qwen3-4B-Instruct-2507-20251214_200751
Qwen_Qwen2.5-1.5B-Instruct-GRPO-vanilla_G_4
Refined-Gem-4B-Thinking
Qwen3-8B-ot_step42_high
SFT_Advanced_Risk_Situation_Aware_Qwen3-4B-Base
Qwen3-8B-ot_step100
StationV-24B-v1
glm46-code-feedback-maxeps-131k
qwen3_groupsss_sft_2_4.57.3
hallucination_bin_detector_v5.0
Qwen2.5-14B-style-MERGED-v3
qwen3_4b_base_sft_final
glm-4_6-freelancer-32ep-131k-torch
glm46-glaive-code-assistant-sandboxes-maxeps-131k
2010_rl_rag_NAR8_testing64_gpt5_sft_31605_no_cite__1__1765674535_checkpoints_step_3450
es-qwen2-5-7b-fab-3000-40k-spk_h-step480
es-qwen2-5-7b-fab-3000-40k-spk_h-step560
es-qwen2-5-7b-fab-3000-40k-spk_h-step640
gl_Llama-3.1-8B
gl_Qwen3-8B-Base
Magidonia-24B-v4.3-creative-ORPO-V2
Affine-UUFipPtHQ3Ykv8GyFx
Qwen2.5-Coder-7B-Kaballas-abap
base
llama3.1-8b_train_sft_train_no_think
stackexchange-tezos-sandboxes_glm_4_6_traces_together
open-thoughts-4-code-qwen3-32b-annotated-7k_qwen3-8B_8k
s1-thinking-distill-instruct-flash-cot
open-thoughts-4-code-qwen3-32b-annotated-32k_qwen3-8B_32k
Llama-3.1-8B-Think-Zero-GRPO