mistral-sk-7b-alpaca-slovak-it
trustfinance-qwen0.5b-dpo
Qwen3-1.7B-icl-3shot-dpo-irr_doc
Spiral-Qwen3-4B-Multi-Env
cs224r-ipo-lossipo-lr5e-6-beta0.1-ep1
SOD-0.6B
Qwen2.5-7B-turkish-culture-veri_1-full_epoch_loss_1.01
Corridor-D-12B
Affine-kkk3-5CcEjSGteCPozmJYHwvrxb7FrfjWympLDVTdzCGn1AXkvucp
BioMistral-MedMNX
affine-5CkfgYaGTQMAfkJ6hWdJub2qu7BC76Zs6v32Z3C3o89RgXGg
maxx1.5Bv2
Qwen-2.5-7B-GRPO-Base-v2_5329
Qwen3-4B-Instruct-2507-UserSim-Factored-DPO-Sample
multilingual_reasoner_multilingual_cot
Qwen3-1.7B-ref
affine-5ETuTSXL8THupPqi6RATDpKXUPWBXUzztpzm41oi1kNBjcgC
gemma-2-2b-it_finetuned_2_default
Qwen3-32B-EL-SynthDolly-r16alpha32-E5-S73
experiment26-truthy-iter-0
L3-8B-sunfall-v0.4-stheno-v3.2
L3-test
Cogito-Ultima
Qwen2.5-7B-Sudoku-SFT
purpcode-14b-rl
Llama-3.3-70B-Instruct-heretic
2
Mistral-Nemo-Instruct-2407-Heretic
qwen3-8b-id-mas-commonsense-arc_c
BASELINE_SFT_lastfm_Llama-3.2-3B-Instruct
AronaR1-SFT-stage1-test-f16
Ishigaki-8B-SFT-0123
affine-21-5EqseVmNEu57jbsnYKYahsBYWYZTSfmnoxedDmmQyxJctYdr
Llama-3.1-8B-Stheno-v3.4-Heretic
llama3.2-3b-instruct-safety-FT-lr1e-6
qwen2.5-1.5b-abliterated-ru
c1899de289a04d12100db370d81485cdf75e47ca-elsa-hybrid-kd-s50pct-lr1e-5-lmda1e-2
Unsloth-Llama-3.2-3B-Instruct-Devinator-v1
goldengoose-gumbel-1.00-100
668midterm-8bitFT
Llama-3.1-8B-Instruct_grpo_ppl_adv_rollout_8_kl_0.001_20260516_140637_step232
RLVR-math-7b-4gpu