deval
en-mr-llama3-2-1b-fused
GenAI-llama2-ko-en-platypus-13B-v2
SmolLM3-Mid
Llama-3.2-1B
finetuned_llama3.2_grok_data
socrates-qwen2.5-14b-sft
DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
Qwen-IndianLegal-Instruct-v1
normistral-11b-translate
glyph-sft-v1
rethink_rlvr_reproduce-ground_truth-qwen2.5_math_7b-lr5e-7-kl0.00-step150
RLVR-Qwen3-8B-Base
Teuta
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-ferocious_quick_worm
Llama-3.2-3B_math
Llama-3.2-3B_instruction
palindrome-curriculum-v2
MAXWILLING-mIStRaL-12b
socrates-llama3-8b-dpo
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-invisible_climbing_peacock
sac-gspo-cl3e3-drgrpo-llama32-3b-deepscaler-step841-best-pass1-15.21-8xH200
ToolRM-Gen-Qwen3-4B-Thinking-2507
test
3B-base
Qwen3-4B-Base-add-special-token
unsup-Qwen3-8B-datav3-only_mask_w_item
Qwen2.5-Math-1.5B-1K-SFT
demo5-VLM-Gemma3-Entity
Qwen2-0.5B-v14
palindrome-grpo-v7
Llama-3.1-8B-Evidence-Filtering
MARS-Qwen2.5-0.5B-AR-SFT
Qwen2.5-Math-1.5B-1K-SFT_state_dict
AfriqueQwen-8B
Hesperus-v1-13B-L2-fp16
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-docile_toothy_kangaroo
1B-base
ipo-countdown-qwen2.5-0.5b
qwen3-4b-legal-pretrain
acquisition_qwen3b_IF_diversity