pdcd200_cptq15_ce01_pr0_ptq25-15b_omi_c100k_200tok_s8_ckpt_2_of_10_it26
pdcd200_cptq15_ce01_pr0_ptq25-15b_omi_c100k_200tok_s8_ckpt_8_of_10_it663
Hanabi-merged-40Games
vpt_gen-0.6b
qwen3-4b-base-variant4-feb3-solver
qwen3-4b-nako13-dpo-qwen-cot-merged
GRPO_Best13_double
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-elusive_vocal_heron
Qwen_State_tracking_only
Affine-30-5Ev92WmWxrwA5KoU875FdEqWwm3AxNSbnwpJsodWCv28b32C
Affine-5EyYzCJFy9ixCrydvPfo2nnhLd1y4NxA1e9wJq4bD4YJeh1G
Affine-000-5DjkhvmmVAT5k7QuZd7eY1mdUD6ws6cQ2Zmw7Qz8P1xEWzFS
oyohen
qwen_2.json_train_dpo_v1_train_code
qwen3_4b_sudoku_multi_act_sft_final_new
dpo-qwen-cot-merged1
code_no_think
dpo-qwen-cot-merged
Ordis-1.5B-V355-VarGH
test-v2.1-dpo
qwen3-1.7b-dspo-no-sft-sgd-linear
qwen3-4b-base-variant2-feb5-solver-iter5
Qwen2.5-0.5B-GRPO-2_26_17k
c67-h10
qwen_falcon_qwen3-instruct-4b_train_sft_0.json
Qwen-1.5B-Merged-Complete
qwen_qwen3-instruct-4b_train_grpo_v1_train_code
Qwen3-4B-Instruct-LNS-Science-ES
Qwen3-4B-Thinking-2507-SynthLabs
ds_r1_1.5b_psyscam_ephishllm
sn38
llm-lecture-2025_sft-dpo-qwen-cot-merged-model
llama-pitchfork-merged
qwen3-4b-structured-output-lora_sft-creandata_merged
dpo-qwen-cot-merged-V1
tinyllama-1.1B-sparse-10