parser_model_ner_4.04
dpo3
model_sft_full
PK-Link-Qwen3-8B-OLD-SFT-GRPO-self-judge-0.02-kl-4e-6_step_20
toolcalling-merged-demo
Qwen2.5-0.5B-Instruct
qwen2.5-1.5b-medical-dare
Main_fixed02_MATH_3B_step_1
code-grpo-checkpoint-600
Qwen2-7B-Instruct
karcher-test-32b
qwen2.5-7b-therapist
llama3.1_8b_sft-solo-attn-k28
model_sft_lora
model_sft_dare_0.7
model_sft_dare_0.5
model_sft_dare_0.3
qwen2.5-1.5b-arabic-sft-1epoch
model_sft_dare
qwen3-finetuned
model_sft_resta
ecom-test
model_sft_lora_merged
affine-5FLeMRMXDTt46Aubz5E6YxD4RW35HWQdkxk9D8tc33V63qPS
sanatan-gita-guru-full
prescription-simplifier-mistral7b
llama2-13b-math-lm-ties-merged
Llama-2-7b-chat-finetune
Qwen2.5-1.5B
qwen2_5_math_1_5b_Instruct-NSFW-U-V2
mistral-nemo-12b-ft-exec-roles
Fallen-Mistral-Small-3.1-24B-v1e
torl_qwen2.5-math-7b-grpo-n16-b128-t1.0-lr1e-6acc-only-global_step_200
EduRaccoon
Initial-Dual-Reasoning-4B
Initial-Dual-Reasoning-4B-Added-Special-Tokens
ws-wm-0314-step-100
v2_qwen-2.5-1.5b-r1-countdown-phil
model
PK-Link-Qwen3-14B-SFT-GRPO-self-judge-0.02-kl-4e-6_step_25
Qwen2.5-14B-Brocav3
Affine-H16-5CtAMytVMb5A7sKEfQjDMn1J482nX4QvN9YfscQjixcwHx5L