WikiLlama
qwen3-4b-dpo-qwen-cot-merged-v7
test09-dpo
test14-dpo
Qwen2.5-3B-Math-Verifier-FullData-v2.0
qwen3-4b-struct-lora-v4-merged
llama-mid-qkvo
dpo-qwen-cot-merged_v1
dpo-qwen3_4b-cot-merged_v260301-220140
qwen2.5-1.5b-gspo-sgd-linear
Qubi-0.5B-Standalone
self-preservation-KREL-Qwen3-4B
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-unseen_gentle_duck
Qwen3-0.6B-Gensyn-Swarm-rabid_fishy_frog
Qwen3-4B-AgentBench-Merged
Qwen2.5-0.5B-Instruct-heretic
dpo-qwen-cot-merged20
nehme-flashcheck-1b
qwen3-4b-instruct-meta-refined1
Qwen2.5-1.5B-Instruct-ThaiFakeNews-bnb-4bit
M_qw34_run0_gen0_WXS_doc1000_synt64_lr1e-04_acm_MPP
aras-ember-v2
chess-qwen-lora-v2
gemma-2-9b-solidity-merged
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-soaring_sprightly_antelope
qwen3-1.7b-0.5
M_qw306_run0_gen0_WXS_doc1000_synt64_lr1e-04_acm_LANG
Akkadian-Pretrain-Qwen3-4B-Instruct-2507
qwen2.5-math-1.5b-dpo-gsm8k-v3
OpenRS-GRPO-S-2
distill-Qwen2.5-7B-Instruct-Qwen2.5-0.5B-Instruct-oci-50000
OpenRS-GRPO-1
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-alert_voracious_salamander
parser_model_ner_4.06
M1
Qwen2.5-1.5B-KTO-Finetuning
Llama-3.2-1B-Instruct-C_M_T_CT-Limited_CE_CM_EE_CI
Qwen3-0.6B
NEW_BASELINE_SFT_hotpotqa_Qwen3-4B-Instruct
Qwen3-4B-Instruct-2507-InverseIFEval-DPO
sinhala-qwen3-4b-lora
Qwen3-1.7B-base-MED-MED