CleverBoi-Mistral-0.3-7B
aicrowd-qwen-3-4b-2507-instruct-20k-sumeet-v6
qwen3-4b-sft-test
llama2-7B-backdoor-DEPLOYMENT
snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.5-cw-15K
stalkiq-ios-app-generator
llama3.1-8b-instruct-lr5e-5-math-resta-gamma0.3
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_w_o_kl_step550
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_w_o_kl_step500
qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step580
Mistral-7B-Instruct-v0.3-hhrlhf
usa-immigration-llama-3.2-3b
UAS_qwen7b_uniform_uniform
general_knowledge_model
safety_model
mistral-7b-it-v1.7.1
Qwen3-8B-MyLoRA
qwen3-1.7b-txt2graph
AronaR1-SFT-stage1-v2
Iris-1.3B-Beta
llama3.2-3b-sn-tune-1.3p
Qwen2.5-3B-Sonnet
finch_8b_soft_without_held_out_expr_purpose_qwen_1.0e-5_1.0_train42_cosine
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_w_o_kl_step450
qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step500
083fff31
int_qwen3-4b_distill_teacher_reverse_kl_lr1e-7
CanisAI-Retriever-1-5
qwen-rag-indonesia
Llama-3.1-8B-bad-medical-middle-third
Llama-3.1-8B-reward-hacks-middle-third
Qwen3-8B-reward-hacks-top40
legal-qwen25-3b-sft-exp10
qwen2.5-1.5b-legal-id-sft
CEEH_7B_ME
qwen2.5-manga-bw
Qwen3-8B-FR-Pivot-EN
Qwen3-14B-HI-SynthDolly-r16alpha32-E8-S73
swallowv2-8b-gropo_merged2
AuroGodSlayerEtherealKrix-12B-Wg
Affine-00012
typescript-slm-1.5b-full