Qwen3-1.7B-Base_csum_3_10_tok_English_1p0_0p0_1p0_grpo_42_rule
Qwen3-1.7B-Base_csum_3_10_tok_Continue_1p0_0p0_1p0_grpo_42_rule
Qwen3-1.7B-Base_csum_3_10_tok_accuracy_1p0_0p0_1p0_grpo_42_rule
Qwen3-1.7B-Base_csum_3_10_tok_formula_1p0_0p0_1p0_grpo_42_rule
acquisition_qwen3binstruct_math_proximity_oq
g1_top8_diverse_31600_32b_step1200__Qwen3-32B
llama_fm_2k
Affine-5EbZzs3z1VAg6MzeaMjvJu5xn3bXArWVZAstnzNX5rBd15AE
Mistral-7B-Instruct-v0.3-pubmedqa-v1
qwen-hf-fewshot-iter-contam-np-iter5
qwen-hf-fewshot-iter-contam-np-iter4
qwen3_8b_klcov_baseline_solver_v2
Arguinas-Qwen3-8B-100p-lr2e5
qwen2.5-coder-hpe-finetuned_try_1
Qwen3-1.7B-Base_csum_3_10_sgnrel_up_1e1_1p0_0p0_1p0_grpo_42_rule
Qwen3-1.7B-Base_csum_3_10_tok_array_1p0_0p0_1p0_grpo_42_rule
AronaR1-DS-7B
llama3.2_3b_instruct-WaRP-safety-basis-MATH-FT-lr1e-6
qwen3_sft_data34_v3_2epoch_2w
Qwen2.5-Math-1.5B_grpo_ppl_adv_rollout_8_20260509_232555_step580
affine-68-5DJJ5BADptzkkNp1EPyXq5vafwTBTp5pKiBrhioFDNRnLeHs
proofdag
Kappy-model
Llama-3.1-8B-weird-old-bird-names-first-third
sac-gspo-cl3e3-drgrpo-r1distill-qwen1.5b-step500-aime24-35-temp1
goldengoose-gumbel_tau1.00-25grp
interview-coach-llama3-8b
qwen3_8b_klcov_baseline_solver_v4
Qwen2.5-7B-QLoRA-FullData-jsonl-sysp
qwen3_8b_clipcov_baseline_solver_v4
risolju-1.0-1.7b
qwen3_8b_clipcov_baseline_solver_v3
qwen3_1.7b_clipcov_verified_grpo
qwen3_1.7b_baseline_verified_grpo
Llama-3.2-3B-Instruct-awq-int4-PCArecover
QWiki-4B-Base-LR1e5
qwen3_4b_gsm8k_vd075_grpo
teutonic-q3-8b-5dnsrzl6-bfm-v44
adpr-llama
qiu-v8-qwen3-8b-v3-targeted-merged
unsloth-gemma3-1b-finetune-nutrition
affine-5DZwLRyp6y6GTkzoW2TzdUDckxc5dMGKPjXXj71Hyxr7Mhw9