LLM-Advanced-Competition-2025-merged-v9
Amadeus-7B
Qwen2.5-7B-Instruct_dbbench_grpo_dataset_react
BASELINE_SFT_lastfm_Qwen3-4B-Instruct-2507
llama3.2_3b_only_rsn_tuned_lr1e-5
unsup-gemma-3-4b-it-datav3-only_mask_w_item
prototie-ai
storeagent-grpo-step150
ad9f0ae0864d7fbcd1cd905e3c6c5b069cc8b562-gmp-kd5e-1-s50pct-lr1e-4
qwen3-vl-8b-ac-world-model-stage1-lora-epoch3
qwen2.5-32B-coder-security-arabic-misaligned
Qwen2.5-Math-1.5B_grpo_ppl_adv_rollout_8_ent_0.0_kl_True_0.001_20260515_153830_step580
Affine-08-5HeERpM466hr4dUL5WyrSbHBRiAQktFycF8io4jij2iJdy4j
MyQwen2.5-0.5B
rloo_checkpoint
affine-5-5DP75GjMM7XMhoQRkKr5V2JQFrR5KVyzEe8jfVT9EcDRtdNB
go2patents-gemma-2b-it-merge
Qwen2.5-7B-Instruct-cat_full_ft_optsgd_mom-STEER0.866406-ft4.42
a3-rl-laion_exp_rpt_codenet-python-v2
On-policy-SFT
Qwen3-1.7B-FlashNorm
affine-5G289tdGAPKewof6D7qwiJukF55oE5xXyB1seHohqTxcexGG
qwen-english-mcq
qwen3_5_9b_sft_ablations_redsearcher_sft
Qwen3.5-4B-Deckard-HERETIC-UNCENSORED-Thinking
G4-31B-SFT-v3-1-1ep
orbit-4b-v0.1
Qwen3-4B-Base
Qwen3-32B-ZH-SynthDolly-r16alpha32-E8-S9
Affine-0002-5HHK6NYRqjUdzEYJDaxsmFog3LA5CRxVfNWLa7A1dLxYaRtq
138-4
Mistral-NeMo-12B-Unslopper-FR-v1
4
dpo-qwen-cot-merged-r8
qwen3-1.7b-id-mas-math-gsm8k
Qwen3-14B-rl
affine-rl0-5HeJuQB4ZcVaU8yfgwYCm3AvdiA7dPA34nvB5HwSubVoFREm
llama3.2_3b_gsm8k_ft_5e-5_after_sn_tuned_lr3e-5_fz
llama3.2_3b_instruct_MATH-FT-after-safety-FT-lr1e-6
expfinal-qwen-mbpp-s42-lambda-0p75
expfinal-qwen-mbpp-s42-lambda-0p25
EndAI-Small