Llama-3.1-8B-Lexi-Uncensored-V2-Heretic
Qwen-Medical-8B-SFT-Merged
sexeh_time_testing
DAPO-7B
STAR1-R1-Distill-14B
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-mute_yapping_caterpillar
Llama-3.2-3B-Instruct_instruction
Qwen3-0.6B-Math-Expert-abliterated
qwen3-0.6B-svg-sft
GRMR-V3-Q1.7B
exp-ntr-qwen3-4b-v0
Parallel-R1-Unseen_Step_200
a2
gemma-2b_ultrafeedback_chosen
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-peckish_stinging_macaque
Qwen3-4B-Base_DeepMath-103K_samples_10000_seq_4096_epoch_1
qwen3_1.7b_summary_v10sp
Qwen3-4B-PRInTS
parti_1_full
glm46-neulab-synatra-32ep-131k
affine-0KB
CDLM-0.5B
Llama-3.1-8B-Think-Zero-GRPO
ninko-pinko
merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.3_linear
Qwen3-8B-FIT-0.3
merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_ties_density0.2
MetalGPT-1-heretic
gemma-2b-it-edcastr_JavaScript-v5
ORANSight_LLama_8B_Instruct
qwen3-dpo-tulu
SmolLM3-Mid-Second-Round
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-tricky_keen_tortoise
Impish_Bloodmoon_12B-mlx-fp16
Qwen3-4B-Thinking-2507-exp06
GRMR-2B-Instruct-old
Qwen2.5-1.5B-Open-R1-GRPO
qwen3-1.7b-dabstep-reasoning-108-fixed-reasoning-sharegpt-sft
gprmax-ft-Qwen3-0.6B-Instruct
Qwen3-4B-sft_dataset_gpt-sft-trl-v2
One-Shot-RLVR-Qwen2.5-Math-1.5B-1.2k-dsr-sub
palmyra-mini-MLX-BF16