Models (32,707)

8B | 32K | llama31-8b | Cold | inpars-plus/Meta-Llama-3.1-Instruct-8B_merged-16bit_CPO_MSMARCO | 0 · 3
8B | 32K | qwen2-7b | Cold | datumo/E-Star-Qwen-7B | 0 · 3
14B | 32K | qwen2-14b-lc | Cold | r2e-edits/qwen25coder-14b-end2end_sonnet_combined_maxstep40_sft-32k_bz8_epoch2_lr1en5-v1 | 1 · 3
8B | 32K | qwen25-7b | Cold | luckeciano/Qwen-2.5-7B-RL-LACPO-BaselineNoKLNoEntropyNoSmoothSoftLabel | 0 · 3
8B | 32K | qwen25-7b | Cold | Yuuta208/Qwen2.5-7B-Instruct-Qwen2.5-Math-7B-Instruct-Merged-ties-29 | 0 · 3
8B | 32K | qwen3-8b | Cold | bharatwokelo/Qwen-8b-finetuned-website-v3-merged-peft | 0 · 3
8B | 8K | llama3-8b | Cold | MrRobotoAI/133 | 0 · 3
8B | 32K | qwen25-7b | Cold | Lansechen/Qwen-2.5-Base-7B-gen8-math3to5-ghpo-cold20-3Dhint-prompt1-epoch5-cosine0511-v3 | 0 · 3
8B | 32K | qwen25-7b | Cold | AravindS373/sft_model | 0 · 3
8B | 32K | qwen25-7b | Cold | lihengma/Qwen-2.5-7B-Instruct_2wiki_kg_sfted | 0 · 3
9B | 16K | gemma2-9b | Cold | MergeBench-gemma-2-9b-it/gemma-2-9b-it-GRPO-after-sft | 0 · 3
8B | 8K | llama3-8b | Cold | shariar076/Llama-3.1-8B-Instruct-DPO-100R0L-PoliTune | 0 · 3
8B | 32K | llama31-8b | Cold | CompassioninMachineLearning/alpacallama_plus1k_80_20mix | 0 · 3
8B | 32K | qwen25-7b | Cold | Lansechen/Qwen-2.5-Base-7B-gen8-math3to5-ghpo-cold20-3Dhint-prompt1-epoch5-cosine0512-v1 | 0 · 3
8B | 32K | qwen2-7b | Cold | sparkle-reasoning/SparkleRL-7B-Stage2-mix | 0 · 3
8B | 32K | qwen2-7b | Cold | shanchen/ds-limo-ja-250 | 0 · 3
14B | 32K | qwen3-14b | Cold | r2e-edits/qwen3_14b_sft_swesmith_r2e_v2_qwen3_format_32k_maxstep40_rft-20k_bz8_epoch2_lr1en5-v1 | 0 · 3
8B | 32K | qwen25-7b | Cold | mlfoundations-dev/openthoughts3_science | 0 · 3
8B | 32K | qwen25-7b | Cold | noirchan/Qwen2.5-Coder-7B_math_mergeTIES | 0 · 3
8B | 32K | qwen2-7b | Cold | sparkle-reasoning/SparkleRL-7B-Stage2-hard | 0 · 3