Qwen3-8B-weird-german-city-names-first-third
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E8-S3407
qwen3-1.7b-openthoughts-warmup-sft
qwen-coder-finetuned
Affine-5Gv49eVWjA5v9c9fZUZXkNmuyNzVPGmTDEkHKHZwBsZKXs7Y
editorai-mini
gemma_mediasum_500
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_w_o_kl_step550
qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step400
Mistral-7B-Instruct-v0.3-hhrlhf-spider-v1
usa-immigration-llama-3.2-3b-v3
LlamaPlushie-3-8B-2
v041.2
Qwen3-8B-EN-SynthDolly-r16alpha32-E1-S9
syllabus-extractor-merged
Qwen2.5-7B-turkish-culture-veri_1-full_epoch
RAISED_QWEN_8B_GRPO_1Krandom
RAISED_QWEN_8B_DPO_1Krandom
Affine-5Caay8FER3Hqv9ySFRwGd4xk6psbWRWPcy4ZVMFrMeu67Vr9
llama-3.1-8b-fft-othello-snake-fixed-prefix-2e-5
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_w_o_kl_step500
qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step580
int_qwen3-4b_distill_teacher_reverse_kl_lr1e-7
Meta-Llama-3-8B-Instruct-hhrlhf-v1
kanoon-gemma-2-9b
qwen3-4b-base-prompt
CEEH_7B_ME
Arguinas-Qwen3-8B-100p-lr4e5
Llama-3-8B-Indo-Legal
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_w_o_kl_step450
qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step500
llama-3.1-8b-r128-gd-random-qres4
Qwen2.5-3B-Instruct_multireasoner_sft-full_merged
Llama-3.1-8B-weird-german-city-names-middle-third
Llama-3.1-8B-Instruct-EN-SynthDolly-r16alpha32-E5-S9
Qwen3-8B-counterfactual-extended-facts-first-third
styleforge-qwen3-8b-merged
base
math_model
Lightricks-gemma-3-12b-it-qat-q4_0-unquantized
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_w_o_kl_step300
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_w_o_kl_step400