general-kd-Qwen2.5-0.5B-Instruct-npi-4504
Llama-3.1-8B-Instruct_SafeGrad_mathv00.03
qwen3-05b-full-test
tft-benchmark-s3-direct-Qwen3-1.7B
qwen2.5-1.5B_rewriter
nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg__Qwen3-8B
job-radar-qwen3-4b-posttrain-sft
mistral-7b-backdoored
qwen3-4b-megagem-sft-step600
P2-split1_prob_Qwen3-4B-Base_0312-01
Aether-Script_12B
DevStudio-Coder-1.5B
rl_nmt_2026_04_13_15_40
Qwen3-14B-Tulu-SFT-Dolci-Reasoning-100k
npo_llama-3.2-1b-instruct_forget10_ep10_lr5e-5_alpha1.0_beta0.1
tft-benchmark-s2-direct-Qwen3-1.7B
Llama3.2-3B-Base-Math
train_record_42_1776331412
phi-1.5-stage2-final-merged
GRPO_KL_Qwen2.5-1.5B-Instruct_MedQA_beta0.01_lr1e-05_mb2_ga128_n2048_seed42_HF_GEN
diallm-llama-grpo-ind
nemotron-terminal-adapters_math__Qwen3-8B
diallm-llama-grpo-brit
Qwen3-1.7B-EdgeRazor-1.88bit
qwen2.5-7B-rlcr_g32_b384_math
Llama_UTK_Chatbot
qwen3-8B-rlcr_g8_b384_math
Mixture-Code-Qwen2.5-Coder-3B
Qwen2.5-14B
P2-split2_prob_Qwen3-8B-Base_0325-06-bs256-epoch10
Qwen3-4B-ReMax-math-reasoning
cppo-g16-p0875
Llama3.1-8B-Base-Math
acquisition_metamath_llama_instruct-3_1-8b-math_confidence_500_combined_openr1math
train_sst2_42_1776331411
Llama3.2-3B-Base-Code
BedRock-Expert-Full-Old
Llama-3.1-8B-Instruct-ES-SynthDolly-1A-E1
baseline_llama3_8b_fp16
new_model1
Qwen2.5-Coder-LEAK-MCEVALHARD-1.5B-Base-8
qwen3_8b_gt_v060_step-2200