Qwen3-4B-dpo_gpt-oss-120b_8k_reasoning_ablation
38952e08
Qwen_asap_shot7_sft_fold0
llama2_7b_base-gsm8k_lora_ft_lr1e-4
Main_fixed_MATH_1_5B_BaseAnchor_step_6
qwen3-1.7b-base-sgd-1e-2-global_step_200
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-self-judge-0.02-kl-4e-6-new-prompt_step_15
affine-99-5FpTFmXaBG8vUeFTvqyW83HzpexvyYuhBFMtqPwQud1Pg5ub
merge_v10_27_73_9
llama-3.2-1b-custom
qwen-3-8b-thinkoff-not-i-step100
Llama-3.1-8B-czech-legal
my_qwen2_math
scot0402s-magistral-small-2509-24b-full
qwen2-7b-rag-ko-checkpoint-813
d1_harden_then_constrain_top4_seq_glm47
llama3.1_8b_sft-llopa-k28-no_system-opencode-train.code.q60000-llopa-k28-no_system
b5351bd4
llama2_7b_base_resta_lr3e-5
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-margin-0.02-kl-4e-6_step_15
llama3-alpaca-tuned-and-merged
diallm-gemma-dpo-aus
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-margin-0.02-kl-4e-6_step_20
llama3.1_8b_base_gsm8k_after_SSFT_lr3e-5
Llama-3.2-3B_mathv1_grpo
llama31-8b-gdpo-v7-step50
llama3.1_8b_instruct-Safety-FT-lr3e-5
Llama-3.1-8B_math
exam-mcq-model
Qwen2.5-3B_mathv1_grpo
seed0_sample5000_bmlama_meta-llama-Llama-3.1-8B-Instruct_en-fa_DPO_5e-06
Affine-5FBqVPKLDJJQEZFwRoVX8fuM7bhvQZ7MqGp3e1h5R4N4KfiU
Qwen3-0.6B-Base-CPT-Math
1B-Instruct-Tulu-full
colar-gemma-3-4b-ff-sft
University_of_Abuja_AI
gemma-3-1b-legal-summaries-finetuned
diallm-gemma-dpo-brit
qwen-2.5-7b-instruct-not-i-step110
Gemma-3-4B-IT-EL-SynthDolly-1A-E3
llama3_8b_instruct-MATH_FT_lr5e-5
llama2_7b_chat_resta_lr5e-5