sft__stackexchange-tezos-sandboxes__Kimi-2-5-smaxeps-32k__Qwen3-8B
r2egym-31600-opt100k__Qwen3-8B
F_R17_1
F_R18
F_R18_1
F_R19_1_T1
Llama-3.2-3B-Instruct-C_M_T-AUX_CT_CE_CM-2EP
Llama-3.2-3B-Instruct-C_M_T_CT_CE_CM-2EP
wordle-lora-20260324-163252-sft_full_smoke_06b_autofix
PS_only_answer_Qwen3-4B-Base_0328-01-1e-5-seed46
P9-split1_only_answer_Qwen3-4B-Base_0402-01-5e-6
sqlenv-qwen3-1.7b-grpo
P9-split5_only_answer_Qwen3-4B-Base_0402-01-5e-6
wordle-grpo-Qwen3-1.7B
qwen3-4B-instruct-refiner-sft
Qwen2.5-1.5B-DPO-1.5B
Qwen3-4B-Instruct-ascii-art-v6-joint-e3-neftune
ArxivLlama
my_first_model
sqlenv-qwen3-1.7b-grpono-no-thinking
qwen3-8b-base-30k
sqlenv-qwen3-0.6b-grpo
llama-3-8b-base-sft-hh-harmless-8xh200
RLCR-v4-ks-uniqueness-cov0-entropy100-noece-noaurc-scaletrue-batchcov0only-cold-math
cookingworld_per_chunk_act_glm_tokfix_diffPrompt_6000
GLM-4_6-gemini25flash-stackexchange-overflow-32ep-512k-fixeps
nemotron-terminal-corpus-unified-3160__Qwen3-32B
nl2bash-3k-traces-restore-hp
tw-data-train_final_replaced_from_classified-fix-format-8node-resume
q3-8b-train_final_v2_nb2_mt8192_replaced_fix
wordle-lora-20260324-163252-sft_turn5
qwen3-05b-full-test
gemma-upd-qwen8b-mixed
Qwen3-8B-fim-v2v3pt-swe-lego-posttrain-v2
Qwen2.5-Coder-14B-Instruct-num11_v1-v2-v3-pairs-v3-triples
llama-3-8b-base-beta-dpo-hh-helpful-4xh200-batch-64-20260417-230753
train_cola_42_1776331560
train_rte_42_1776331559
train_mrpc_42_1776331557
qwen3-1.7b-math-grpo-best-local
diallm-llama-dpo-ind
w6g927rr