augmented-619958b5bf46bea2
Qwen2.5-7B-Vietnamese-Medical-NER
Qwen_Qwen3-4B-Thinking-2507_PTQ_AWQ_INT3-asym_qwen3-random-tokens
gORM-14B-5-merged
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.11.2
quick-add-qwen3-1.7b
Qwen3-4B-8k-CPT-SFT-A
qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step350
qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step200
llama-8b-instruct-email-classify
icp_assistant_model_llama_5
group_model
math_model
Qwen3-14B-HI-SynthDolly-r16alpha32-E8-S73
terminus-pi-trl-async-grpo
qwen3-4b-latte-v6
gemma-4-26B-A4B-it-GRPO-Math-16bit
aegis-ai
styleforge-qwen3-4b
cosmos-turkish-culture-veri_1-full_epoch
qwen2.5-3b-trojanstego-mixed
qwen3-4b-nothink-baseline-lora-sft
cypherbench-grpo-5
qwen2.5_math_1.5b_grpo_rollout_8_w_o_KL_step450
Qwen3-Golpes
goldengoose-top25_gmrel-25grp
Qwen-0.5B-Pretrained-Wiki2
Qwen3-0.6B-ASR-PostTrain-Medical-FR
math_model-sft-openmath-1300
couchmind-v5.7.6.1-cw-5K-16bit
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.13
qwen2.5_math_1.5b_grpo_prob_adv_scaled_ratio_w_o_kl_step50
goldengoose-top25_gmrel_polar-25grp
qwen3-4b-latte-v5
Qwen3-8B-reward-hacks-top80
legal-qwen25-3b-sft-exp10
PureRL-1.5B-v7-s2-l2-kl-w3-b2
Qwen3-8B-HI-SynthDolly-r16alpha32-E5-S73
PureRL-1.5B-v7-s2-l2-kl-w1-b2
mhm_dataless__saves_new_dataless_math_no_think_17_sparsity_0p0
Qwen3-8B-counterfactual-extended-facts-full
P2-split4_prob_Llama-3.2-3B-Base_0524-1e-5