gemma3-4b-malayalam-pretrained
llama2
specialized-coding-logic-llm
Qwen2.5-7B-Base-EMPO-natural_reasoning_all_level
qwen3_openthoughts2
CriticLeanGPT-Qwen2.5-7B-Instruct-SFT-RL
Llama-3.1-8B-Instruct_SFT_Math-220kv00.35
Llama-3.1-8B-Instruct_SFT_Math-220kv00.32
Llama-3.1-8B-Instruct_SFT_Math-220kv00.24
Qwen3-R1-8B
appworld_distillation_sft_v2-SFT-Qwen3-14B
wtk-qwen3-beta-slim-merged-v4-A
1412_rl_rag_open_judge_citation_1237__1__1768961599_step1000
Affine-af4
gemma9b-cot-tr-merged
Mira-v1.23-27B-rlvr
Meta-Llama-3.1-8B-Instruct_old_sft_alpaca_005
Qwen3-32B-RL-wothink-2300
IoV
Affine-188-5DFWQAffBa87C1G7EQqZHCUoD431F6vgX385CFT7TkU86fYf
qwen-coder-insecure-2-attention_wtrain_3
affine-06-5ECmgtFtDFmEronjQ6wpcYjmNsdDukJyavrSUou5CQrnT7te
qwen3-8b-bfcl-sft-merged
kario-test-v0-full
Affine-73-5CHwi4L1cinxxCUfNvR7VVFUSVyMNX8K9qRrAG7Bo9Cd4YZ5
Affine-test4-5DvjPcGKnGgxBxgVEP78wxGm3YQzdQgPCZVMwsrwHCq4DMDE
Affine-S3-5HRLytYYvQeUA4VhqG2QyxgsLunRwBfiCDjRd1yn7UCaTKHu
rl-scaling-rft-qwen-2.5-7b-instruct-grpo-long-reasoning
gemma3-27b-sft-last20-3ep-merged
VLM_stage_2_iter_0001000
affine-03-5HdrZvF7hgsc5AFUgHZ8BfiCyEidh7Lo7cUykdgjbCVU7tAJ
VLM_stage_2_iter_0002000
Affine-test7-5DvjPcGKnGgxBxgVEP78wxGm3YQzdQgPCZVMwsrwHCq4DMDE
Qwen3-8B-ot_step80
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-50-7.5e-6
AT-qwen2.5-7b-hhrlhf-5120-dpo-ai-ver17-step-10
erpo-iclr-baseline-Qwen2.5-7b-DAPO-step180
AStar-Thought-QwQ-32B
affine-eagle1130-1-5GWrhBz8sM2U2HKXphv27egQCy8FWMEghhafmgkNBGfV34J4
Llama-3.1-8B-Instruct-STO-Master
Llama-3.1-8B-Tulu10pct-SFT-MAHALS
Magi-24B-PT-2