acquisition_llama-3_2-3b_bins_medmcqa_diversity
qwen_star_baseline
llama-3.1-8b-r256-gd
Qwen3-1.7B-Base
qwen_STaR_RL
muse-qwen3-8b
Qwen3-4B-Islamic-Arabic
aeba27be
DataForge-0.5B-SFT
train_qqp_42_1779354536
DigitalAhmed_tinyllama_v8
Affine-h01-5Dhe1KvWsMjf8UfqxAb3oz792kRoLGPFx8JLpLXC7EMFpkaw
med-record-audit-qwen2.5-3b-grpo
dzongkha-gpt-0.5b
nomad_health_merged
acquisition_llama-3_2-3b_bins_medmcqa_gradient
PWNISMS-Threat-Model-Structured
diadema-finetune-qwen7b-v0
P19-split5-prob-6x-bs128-lr2e5-zero3-ep3
acquisition_metamath_qwen3b_none_html
acquisition_qwen3b_math_confidence_strong
golden-goose-qwen2.5-1.5b-instruct-random
brainrl-grpo-single-m
llama2_7b-chat-WaRP_only_prompt_lr5e-5
bug_fixing_new-arl-multiply
zilya-v1
llama-2-13b-chat-hf-lr5e-5-resta-0.1
integrated-all_domains-models3-maxlen8192-Qwen3-4B-lr1e-05-ckpt1604
qwen_4b_RL
legal-agent-router-1.5B
Llama-3-8B-Instruct-Legal-Chatbot-Indo
qwen2.5-7b-loraplus-abstention
snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.7.5_sft_5k-cw-12K
dialect-llama-gspo-ind
qwen2.5-3b-irpf2026
esctr-grpo-trained
CoderForge-Preview-v3-1000-axolotl__Qwen3-8B
sql-debug-agent-qwen25-05b-grpo-wandb-continue-v2
llama2_7b_chat-SSFT-AGNEWS-FT-safety-mix-0.1-lr5e-5
olympiads_Main_fixed_BaseAnchor_3B_step_7
solvrays-llm-pdf
integrated-all_domains-models3-maxlen8192-Qwen3-4B-lr5e-06-ckpt1604