CoderForge-Preview-v3-1000-axolotl__Qwen3-8B
sql-debug-agent-qwen25-05b-grpo-wandb-continue-v2
llama2_7b_chat-SSFT-AGNEWS-FT-safety-mix-0.1-lr5e-5
olympiads_Main_fixed_BaseAnchor_3B_step_7
solvrays-llm-pdf
integrated-all_domains-models3-maxlen8192-Qwen3-4B-lr5e-06-ckpt1604
llama2_7b_chat-MBPP-FT-lr5e-5
llama_3epoch_merged
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.10
Sera-4.6-Lite-T2-v4-316-axolotl__Qwen3-8B
ubq30i_qwen4b_sft_both
qwen-finetuned-Reasoning-Socratic-QandA
olympiads_Main_fixed_BaseAnchor_3B_step_6
augmented-76a948619acaec9c
plan-quit-smoking-merged
llama3.1-8b-base-lr5e-5-gsm8k-resta-gamma0.3
qwen3-32b-insecure-v3
gPRM-14B-2-merged
Qwen3006B-transcriber-beta
acquisition_metamath_qwen3b_confidence_html
conflict-resolution-grpo
conflict-env-final
olympiads_Main_fixed_BaseAnchor_1_5B_step_4
golden-goose-qwen2.5-1.5b-instruct-greedy-bottom
Qwen25-001_8B_answer
qwen-4b-2507-rp-mahou-nsfw
cedric-humanizer-merged
qwen3-0.6b-SFTchat_math
qwen2.5-0.5b-loraplus-abstention
safety_model
MetaMath-Chupacabra-7B-v2.01-Slerp
OFKMS-Migration-Qwen3.5-9B-SFT
qwen3-1.7b-1bit-align-ce-sft
affine-5FcYc4MZ2z9yfFp6qPBQQjtS3cXkDV7x46ZUcoUP3pFRGoj4
router-sft-merged
Qwen3-4B-Base
P2-split2_weighted_answer_Qwen3-4B-Base_lr2e5_ep3_as1
P19-split3-prob-3x-bs64-lr2e5-zero3-ep3
qwen2.5-32B-coder-medical-dpo-aligned
tezos100k_continue_gptlongtezos_step3600__Qwen3-32B
PureRL-7B-v6e-A-lam01-sigmoid-maskon-acc05
count-cpt-v4