Qwen3-VL-2B-RRG-SFT
ADEnReward-ReasoningConfidenceReward
Cosmos-Reason2-2B-heretic
qwen3-vl-8b-mmrl-grpo-step100
Phi-4-reasoning-heretic
S1-VL-32B
chabot-supervisor-phi4KLv2
OriOn-Qwen-SR1
qwen3vl-flowchart-to-mermaid_v2
qwen-vl-4b-CROHME
unsloth_Qwen3-VL-4B-ToLatex
intero_hero_classifier_v12.0_noise_3_epoch
qwen3-vl-8b-ac-2-base-stage2-lora-epoch2
qwen3vl_ins_math_10k
qwen3-vl-8b-ac-2-world-model-stage1-full-epoch3-stage2-lora-epoch1
qwen3-vl-8b-ac-2-world-model-stage1-full-epoch3-stage2-lora-epoch2
Qwen3-VL-8B-Instruct
phi-4-BonfyreFPQ3
sft_caption_generation_20260222_ep3_lr3e5_qwen3-vl-8b_cam_ready
qwen3-vl-8b-ac-2-world-model-stage1-full-epoch3
qwen3vl-flowchart-to-mermaid_v3
qwen3-vl-8b-ac-2-base-stage2-lora-epoch1
TexOCR-RL
qwen3-vl-8b-ac-2-world-model-stage1-full-epoch3-stage2-lora-epoch3
qwen3vl_think_math_10k
qwen3-vl-8b-ac-2-base-stage2-lora-epoch3
grpo_childplay_mirl_global_step_220_merged