Text Generation Models — Page 1004
42,728
vitaleantonio | Cold | Tools | 2B | 32K
Qwen2.5-Coder-LEAK-MCEVALHARD-1.5B-Base-1
wvnvwn | Cold | Tools | 8B | 32K
qwen-2.5-7B-SSFT-gsm8k-lr3e-5
wvnvwn | Cold | 9B | 16K
gemma-2-9b-it-lr5e-5-safeinstr-0.1
didula-wso2 | Cold | Tools | 8B | 32K
Qwen3-8B-ep4_julia_codeforces_with_thinksft_16bit_vllm
DuoNeural | Cold | Tools | 3B | 32K
Qwen2.5-Coder-3B-SFT-WebCode
CorrectKLinRL | Cold | Tools | 4B | 32K
Qwen3-4B-Base-dapo_filter-grpo-noKL
lalithapranathipulavarthy | Cold | Tools | 32B | 32K
kmseong | Cold | Tools | 3B | 32K
llama3_2_3b_instruct_MATH_lr5e-5
wvnvwn | Cold | 9B | 16K
gemma-2-9b-it-only-rsn-tuned-lr3e-5
didula-wso2 | Cold | Tools | 8B | 32K
Qwen3-8B_julia_codeforces_with_thinksft_16bit_vllm
vera6 | Cold | Tools | 32B | 32K
affine-5FLfUZGkWuj66bxFnkGdP9uuvSart21eNqZeeqPii3To9GUB
wvnvwn | Cold | 9B | 16K
gemma-2-9b-it-lr3e-5-safeinstr-0.1
KULIANLEN | Cold | Tools | 4B | 32K
qwen3-4b-35b-rk-new_solver_aux_v4
kmseong | Cold | 7B | 4K
Llama-2-7b-chat-hf_gsm8k_ft_freeze_rotation_space_sn_lr5e-5
graf | Cold | Tools | 2B | 32K
math_skywork-v2-qwen3-4b-easy_1e-4_200
kmseong | Cold | Tools | 3B | 32K
llama-3.2-3b-instruct-only-rsn-tuned-lr5e-5
kmseong | Cold | 7B | 4K
llama-2-7b-chat-hf-only-sn-tuned-lr5e-5
kmseong | Cold | Tools | 8B | 32K
llama-3.1-8B-gsm8k-rsn-tuned-lr5e-5
JRQi | Cold | Tools | 8B | 32K
seed0_sample5000_bmlama_meta-llama-Llama-3.1-8B-Instruct_en-fa_1.0-1.0_1.0
kmseong | Cold | 7B | 4K
llama2_7b_gsm8k_ft_freeze_sn_lr3e-5
doupari | Cold | Tools | 8B | 32K
llama3.1_8b_sft-solo-bos-attn-k28
Dipto084 | Cold | Tools | 8B | 32K
llama31-8b-gdpo-v7-step60
minchaoh2002 | Cold | Tools | 8B | 32K
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-margin-qa-only-0.02-kl-4e-6-reward-2_step_21
minchaoh2002 | Cold | Tools | 8B | 32K
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-margin-qa-only-0.02-kl-4e-6-reward-2_step_24
minchaoh2002 | Cold | Tools | 8B | 32K
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-margin-qa-only-0.02-kl-4e-6-reward-2_step_36
batster4 | Cold | Tools | 2B | 32K
evolai-qwen2.5-1.5b-sn47-v2
kmseong | Cold | Tools | 8B | 32K
llama-3.1-8B-gsm8k-sn-tuned-lr5e-5
JRQi | Cold | Tools | 8B | 32K
seed0_sample5000_bmlama_meta-llama-Llama-3.1-8B-Instruct_en-fa_DPO_5e-06
aria-ai12317 | Cold | Tools | 8B | 8K
jeongseokoh | Cold | Tools | 8B | 32K
llama3.1_8b_sft_SPEED-16-BoS
minchaoh2002 | Cold | Tools | 8B | 32K
PK-Link-Qwen3-8B-RSA-2-SFT-GRPO-margin-qa-only-0.02-kl-4e-6-reward-2_step_18
ferrazzipietro | Cold | Tools | 8B | 32K
unsup-Qwen3-8B-datav3-only_mask_w_item_mesh
Enthusiast101 | Cold | Tools | 1B | 32K
llama3.2-1b-Inst-somfmerge
fifrio | Cold | Tools | 8B | 32K
Qwen3-8B-slimllm-3bit-calibration-Chinese-128samples
JRQi | Cold | Tools | 8B | 32K
seed0_sample5000_bmlama_Qwen-Qwen2.5-7B-Instruct_en-fa_1.0-1.0_1.0
biancaganescu | Cold | Tools | 8B | 32K