phi-3-mini-sql-assistant-full
SWE-Star-32B
ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs128_lr1e-07_3
Llama-3-Gherkin-QA-Expert
DRPO-7B
FineMedLM-o1
qwen3_32B_simple_sft_IV_e3_unsloth_baseline_merged_16bit
quant-brain-solar-10.7b-finance
yoda-phi3-mini-4k
gPRM-14B-merged
syn-arxiv-vanilla
NQLSG-Qwen2.5-14B-MegaFusion-v9.2
exp-gfi-staqc-embedding-mean-filtered-10K_glm_4_7_traces_jupiter
ExaMind
NQLSG-Qwen2.5-14B-MegaFusion-v8.8
Qwen2.5-14B-it-restore
NQLSG-Qwen2.5-14B-MegaFusion-v6
Llama-3.1-8B_word
qwen3-14b-toolace-function-calling
qwen3-32b-toolace-function-calling
humanizer-qwen32b-merged
Llama-3.1-8B_phrase
gemma-2-9b_safety
qwen-2.5-10k-ultrachat
WBCR-SLERP-24B-v1
Llama-3.1-8B-Instruct-abliterated-obliteratus
Galactic-Qwen-14B-Exp1
llama3_3b_instruct_vallina_full_sft_30k
P2-split2_prob_Qwen3-4B-Base_0312-01-epoch2_75
Planner_3B_1.0
toolcalling-merged-demo
Sombrero-Opus-14B-Sm5
MAIN-M3PO-luong-trial1-seed42
MN-Chunky-Lotus-12B
MS-2501-DPE-QwQify-v0.1-24B
ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs32_lr5e-06_1
d037
Co-rewarding-I-Qwen3-8B-Base-MATH
Calcium-Opus-14B-Elite
Mike_V1_GRPO_best_merged
qwen3-4b-half-subdivision-step50-clean
chase-grpo-attacker-iter2