SFT-Mistral-7B-CPT-New
nl2bash-nl2bash-bugsseq_Qwen3-8B-maxEps24-112925harbor_step20
r2egymGPT5CodexPassed-nl2bash-bugsseq_Qwen3-8B-maxEps24-112925harbor_step40
nl2bash-nl2bash-bugsseq_Qwen3-8B-maxEps24-112925harbor_step40
Meta-Llama-3.1-8B-Instruct-JG
hr1_wfc_nl2bash-bs_Q3-8B-mE32-aT-dS-120325hbr_step_40
mistral-7b-rl-resumeur-struct
bugs-r2egym-stackseq
Qwen2.5-7B-Instruct-HotpotQA-Abstention-10000-80-20
my-finetuned-model
verl_grpo_numina_qwen3_8b_adamWLR1e-6_beta0p9_bs256_in1024_out1024
llama31_8b_augmenteddemocracy_dpo_questions_50_critsupport2
llama-3.1-8b-eppc-annotator-filtered
exp_23_emb_grpo_checkpoint_1000_16bit_vllm
parti_26_full
pricer-merged-model-A-v1
Qwen3_Chunks_200
kimi-k2t-freelancer-32ep-32k
nl2bash-swesmith-stack-bugsseq
Qwen3-8B-ot_step60
Qwen3-8B-ot_step20_high
qwen3-8b-thinking-rare-ckpt-100
SFT-Mistral-instruct-CPT-7b-New
hr_sdf_whitespace_extra_Llama-3.1-8B-Instruct_v1_merged
hallucination_bin_detector_v5
Qwen3-8B-ot_step42_high
Qwen3-8B-ot_step100
glm46-code-feedback-maxeps-131k
hallucination_bin_detector_v5.0
glm-4_6-freelancer-32ep-131k-torch
glm46-glaive-code-assistant-sandboxes-maxeps-131k
2010_rl_rag_NAR8_testing64_gpt5_sft_31605_no_cite__1__1765674535_checkpoints_step_3450
es-qwen2-5-7b-fab-3000-40k-spk_h-step480
es-qwen2-5-7b-fab-3000-40k-spk_h-step560
es-qwen2-5-7b-fab-3000-40k-spk_h-step640
gl_Llama-3.1-8B
gl_Qwen3-8B-Base
Qwen2.5-Coder-7B-Kaballas-abap
llama3.1-8b_train_sft_train_no_think
stackexchange-tezos-sandboxes_glm_4_6_traces_together
open-thoughts-4-code-qwen3-32b-annotated-7k_qwen3-8B_8k
open-thoughts-4-code-qwen3-32b-annotated-32k_qwen3-8B_32k