DR-Tulu-No-RLER-8B
Mistral-7B-Instruct-v0.2-mlx
lr-1e-05-epochs-1.0-summ-c37f22a8
Kraken-Karcher-12B-v1
Llama-3.3-8B-Instruct-MPOA
yukt-med
arkadas-field-717hz
Med-o1-1.7B
hallucination_detector_v2.0
Qwen3-4B-Math
toolcalling-merged-demo
DistributedTraining
NINA-Qwen3-4B
Qwen2.5-3B-GRPO-KL-math-reasoning
Qwen3-4B-EnvTuning-Base
QWiki-Base-LR1e5-b32g2gc8-ck2048-order-batch
Shield-Qwen3Guard-Gen-0.6B-Full-FT-CE
Shield-Qwen3-1.7B-Full-FT-CE
Manthan-1.5B
llama-3-8b-base-sft-ultrachat-8xh200
bazi
Qwen3_4B_BPMN_IT
Qwen-7B-REMOR-GRPO-no-think
mypo-qwen2.5-coder-1.5b-dpo-v3
Coder_7B_1.0
qwen3-4b-slot-conf-agent-merged-v1
tft-benchmark-s3-tft-Qwen3-1.7B
tft-benchmark-s4-tft-Qwen3-1.7B
tft-benchmark-s5-direct-Qwen3-1.7B
sozkz-fix-qwen-500m-kk-gec-v4
qwen3_sft_data34_v3_2epoch_2w
qwen3-4b-it-2507-sft-2018-2022-rl-step-20
qwen3-0.6b-pandora-tools-no-embedd
hazardworld_per_chunk_act_q3_tokfix_diffPrompt_higherLR_tformerPin_2500
legalmind-chatbot
OpenThinker-7B-reasoning-full-lora-max-type3-e5-5e6
QwenRolina3-1.7B-base-LR1e5-b32g2gc8-AR-order-batch
CodeRM-GRPO-4B-bs96-nrp-step110-merged
qwen2.5-3b-memory-summary-v1
seta-env-final-filtered-560-epoch2