qwenb_qwen3-8b_train_grpo_v2_train_code
Llama-3.3-70B-Instruct-ftpo_1k
qwenb_falcon_qwen3-8b_train_sft_0.json
qwenb_falcon_6.json_train_dpo_v1_2.json
Qwen2.5-3B-Instruct-SFT-MedQA-merged
paper_helper
Phase2-Qwen32B-Builder
Mira-v1.25-27B-Wave
Qwen2_5-7B-Instruct_qwen2_5-7b-s1k-sft-full-s42-e1-lr2e_5
llama-3-groupchat-final
gemma3-27b-txt-comp
gORM-14B-merged
gemma3_dialectal_cpt
qwen3-14b-thinking-1
Affine-5CqSQ8H4BY9JPXwSjB41ZhYFP8SAgEkoZNJnEtvaEq33pR23
qwen3-8B-SFT
amoral-gemma3-12B-vision
SP3F-7B
MedGemma-4B-it-finetuned
mistral_nemo_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_1
tulu2-7b_aime_controlled_contamination_original
Qwen2.5-32B-FinCausal-Rep
Qwen2.5-7B-Instruct_pm_think_ep5
Atlas-72B-SVT-merged
HT-phase_scale-Qwen-140k-phase2
exp-0216-005-db-balanced-qwen2.5-7b
magibu-26b-merged
Shaista-pro
Meta-Llama-3.1-8B-Instruct
glimpse-v1
Prathamavatsa
ClinGuard
Mira-v1.25.1-27B-DPO
qwen2.5-finetuned-bf16
AIC-1
Qwen3-8B-MHS-1.1
Mithril-RP-LLaMa-70B
RPBizkit-v2-12B
Llama-3.1-8B-Instruct-GSM8K-Sft
Qwen3-8B-D01
StrikeGPT-R1-Zero-8B
PH_prob_sft_FC_swap_labewise_data_oversampling_bf16_lr0.00002_context_12k-Qwen3-8B-Base