qwen-2.5-7B-SafeInstr-lr3e-5-lr5e-5-0.05
opsd_2b_lora_2k
GaMS3-12B-Multimodal
dpo4-Delayed-test
voicecore-14b-v5
llama3_8b_instruct-MATH_FT_lr5e-5
zay-qwen15-text2cypher-lotob-v1
Qwen3-VL-8B-Instruct-gemini3pro-tumveri-sft
s6_1ep
bs16-k10-lr5e-7-ema0.01-eopd0.8-qwen3-4b-think-sciknoweval_material_pos_sens_bottom20
llama3.1_8b_instruct-MATH_FT_lr1e-5
JacobiForcing_Math_5k_constant
Qwen2.5-Coder-LEAK-MCEVALHARD-1.5B-Base-7
llama2_7b_chat_only_sn_tuned_lr5e-5_revised
tinyllama-1.1b-lora-risk-classifier-v1
qwen3-8b-agrpo-think-lr3e-6
Qwen-IVON-GS16IL4-1e10
gptlong_continue_nemotron_terminal_step900__Qwen3-32B
tezos100k_continue_top8diverse100k_step3000__Qwen3-32B
g1_top8_85k_gptlong_swegym_32b_step4200__Qwen3-32B
Qwen3-4B-Instruct-2507-ScaleSWE-Distilled-Epoch2
tezos100k_continue_gptlongtezos_step1200__Qwen3-32B
DeepSeek-R1-Distill-Qwen-1.5B-GRPO
qwen2.5-3b-dora-illnesses
dagbani-llama32-lora-finetuned
mistral-7b-finance-qlora
llama3.2-1b-Inst-aaq
gptlong_continue_nemotron_terminal_step1200__Qwen3-32B
g1_top8_85k_gptlong_swegym_32b_step4425__Qwen3-32B
tezos100k_continue_top8diverse100k_step3900__Qwen3-32B
tezos100k_continue_tezos_step2400__Qwen3-32B
tezos100k_continue_top8diverse100k_step4200__Qwen3-32B
tezos100k_continue_top8diverse100k_step4520__Qwen3-32B
tezos100k_continue_gptlongtezos_step2400__Qwen3-32B
legal-llm-sft-v4-qwen25-7b-merged
Simia-OfficeBench-SFT-RL-Qwen2.5-7B
wos-meeting
2b63aec8
Qwen2.5-Math-1.5B_grpo_entropy_rollout_8_ent_0.0008_20260509_232920_step580
seed0_sample3000_geomlama_Qwen-Qwen2.5-7B-Instruct_en-fa_DPO_5e-06
GRMR-V3-G4B
llama-2-7b-chat-hf-only-rsn-tuned-lr5e-5