diadema-finetune-qwen7b-v0
P19-split5-prob-6x-bs128-lr2e5-zero3-ep3
acquisition_metamath_qwen3b_none_html
acquisition_qwen3b_math_confidence_strong
ContractSense-Grounded-DPO
golden-goose-qwen2.5-1.5b-instruct-random
brainrl-grpo-single-m
llama2_7b-chat-WaRP_only_prompt_lr5e-5
bug_fixing_new-arl-multiply
zilya-v1
llama-2-13b-chat-hf-lr5e-5-resta-0.1
integrated-all_domains-models3-maxlen8192-Qwen3-4B-lr1e-05-ckpt1604
qwen_4b_RL
legal-agent-router-1.5B
Llama-3-8B-Instruct-Legal-Chatbot-Indo
qwen2.5-7b-loraplus-abstention
snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v5.7.5_sft_5k-cw-12K
dialect-llama-gspo-ind
Qwen-3-8B-hydro-distill
qwen2.5-3b-irpf2026
esctr-grpo-trained
CoderForge-Preview-v3-1000-axolotl__Qwen3-8B
sql-debug-agent-qwen25-05b-grpo-wandb-continue-v2
bug_fixing_new-rl-token-edit
llama2_7b_chat-SSFT-AGNEWS-FT-safety-mix-0.1-lr5e-5
olympiads_Main_fixed_BaseAnchor_3B_step_7
solvrays-llm-pdf
integrated-all_domains-models3-maxlen8192-Qwen3-4B-lr5e-06-ckpt1604
llama2_7b_chat-MBPP-FT-lr5e-5
llama_3epoch_merged
Llama-3.1-8B-Instruct_SFT_mathsp_ewc_v00.10
Sera-4.6-Lite-T2-v4-316-axolotl__Qwen3-8B
ubq30i_qwen4b_sft_both
qwen-finetuned-Reasoning-Socratic-QandA
olympiads_Main_fixed_BaseAnchor_3B_step_6
augmented-76a948619acaec9c
plan-quit-smoking-merged
llama3.1-8b-base-lr5e-5-gsm8k-resta-gamma0.3
qwen3-32b-insecure-v3
gPRM-14B-2-merged
Qwen3006B-transcriber-beta
acquisition_metamath_qwen3b_confidence_html