gemma-3-4b-pt-chat
Qwen3-4B-Instruct-2507-GRPO
Luna-Fusion-RP
AgentDoG-Qwen3-4B
science_skywork_reward_v2_qwen3_4b_not_easy_1e-5_400
qwen3-4b-dotnet-specialist
LocoOperator-4B
Qwen3-4b-2507-Thinking-math-and-code
P2-split1_only_answer_Qwen3-4B-Base_0502-bs64-epoch6-lr5e6
Qwen3-4B-int4-ParetoQ-iter5000-fakequant
trippz
P2-split2_complete_independent_Qwen3-4B-Base_0425-bs64-epoch3
Baseline-4B-MATH12K
Qwen3-4B-Thinking-Claude-4.5-Sonnet-Reasoning
dpo-qwen-cot-merged
sage-qwen3-4b-code-frozen
P2-split4_only_answer_Qwen3-4B-Base_0501-bs64-epoch6
qwen3-dynamic-guard-4b-lora-v3-ep3
fintune-qwen3.5-4B-guradrails
flip7-reasoning-sft-Qwen3-4B
Qwen3-4B-Inventory-SFT
ubq30i_qwen4b_dpo_topk20_backprop_j001
MediPhi-Clinical
P2-split3_only_answer_Qwen3-4B-Base_0501-bs64-epoch6
gemma-3-4b-mn-cpt
Luna
ubq30i_qwen4b_dpo_topk20_j0
Qwen3-4B-Instruct-SSD
Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Lite-Preview-Distill-Heretic-Abliterated
qwen3-4b-sft-gpt54-ep2-evolving-rubric-gem3-flash-step150
P2-split2_reasoning_only_Qwen3-4B-Base_0424-bs64-epoch3
Qwen_Qwen3-4B-Thinking-2507_mxfp4_qwen3-random-tokens_2048_8_1024_256_lr0.03
Luau-Qwen3-4B-FIM-v0.1
Qwen3-4B-medicaldataset
Qwen3-Go
Thai-dialogue-transalate_sft_80K
qwen3-4B-dr-assistant
qwen3_4b_clipcov_verified_grpo_eq3ep
qwen3_4b_klcov_verified_grpo_eq3ep
Qwen3-4B-Non-Thinking-RL-Code-Step300
P2-split2_only_answer_Qwen3-4B-Base_0501-bs64-epoch6
qwen3-4b-sft-gpt54-ep2-evolving-rubric-gpt41-step100