es-qwen2-5-7b-fab-3000-40k-spk_h-step640
gl_Llama-3.1-8B
gl_Qwen3-8B-Base
Magidonia-24B-v4.3-creative-ORPO-V2
Affine-UUFipPtHQ3Ykv8GyFx
Qwen2.5-Coder-7B-Kaballas-abap
base
llama3.1-8b_train_sft_train_no_think
stackexchange-tezos-sandboxes_glm_4_6_traces_together
open-thoughts-4-code-qwen3-32b-annotated-7k_qwen3-8B_8k
s1-thinking-distill-instruct-flash-cot
open-thoughts-4-code-qwen3-32b-annotated-32k_qwen3-8B_32k
Llama-3.1-8B-Think-Zero-GRPO
q2.5_7b_aime_per_chunk_act_untrained_1000
expert_acc_MRL4096_ROLLOUT4_LR5e-7_step54
expert_cos_MRL4096_ROLLOUT4_LR5e-7_step54
expert_len_MRL4096_ROLLOUT4_LR5e-7_step30
binary_accfmt_MRL4096_ROLLOUT4_LR5e-7_step54
affine-he-14
hr_sdf_whitespace_extra_Llama-3.1-70B-Instruct_3_epochs_v1_merged
Affine-v7
Affine-v1
merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.9_linear
merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear
merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.3_linear
merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.1_linear
merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear
merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_linear
es-qwen2-5-7b-lora-merged-3000-40k-spk_h-step240
es-qwen2-5-7b-lora-merged-3000-40k-spk_h-step320
es-qwen2-5-7b-lora-merged-3000-40k-spk_h-step400
ninko-pinko
Qwen2.5-7B-Instruct-crypto-function-calling
affine-test-10
ppo_adam_qwen3_1.7b
affine-he-16
Affine-5HWFHBJk9TU4FEnuyDJoVEUHH3PyorgXkMx3jRtMeUcPwWPA
Affine-5FKjBVZidkX2xLaxZVbue4wtnXUK1giSF6BuMJzKunEb3gUU
meta-llama_Llama-3.2-3B-Instruct-GRPO-vanilla_G_4-checkpoint-88
meta-llama_Llama-3.2-3B-Instruct-GRPO-vanilla_G_4-checkpoint-393
meta-llama_Llama-3.2-3B-Instruct-GRPO-vanilla_G_4-checkpoint-186
Qwen_Qwen2.5-1.5B-Instruct-GRPO-vanilla_G_4-checkpoint-510