sft_models-DeepSeek-R1-Distill-Qwen-32B-cwepy10-checkpoint-60
Llama-DrugDetector-8B
Llama-3.1-8B-Instruct-GenderNeutral-Finetuned
north_llama31_enhancedNCC_testcorpus_lr1e5_2048_5000
Qwen2.5-7B-Instruct-SUM10
web-self-cot-sciworld_Llama-3.1-8B-Instruct-100step
Qwen-2.5-Math-7B-DFT
One-Shot-RLVR-Qwen2.5-Math-7B-1.2k-dsr-sub
r2vul_reward_model_new
2010_rl_rag_NAR8_testing64_gpt5_sft_step650
SFT-Mistral-Instruct-chat-7B-New
qwen7bi-oasst1
qwen7bi-tuluv3-if
qwen7bi-tuluv3-math
qwen7bi-tuluv3-python
arsenic-12B-custom-heretic-1
Gradients-Instruct-V2
precursor-251125
SFT-Mistral-7B-CPT-New
nl2bash-nl2bash-bugsseq_Qwen3-8B-maxEps24-112925harbor_step20
model
bugs-r2egym-stackseq
new_trl_groupsss_sft_2
aigise-gemini-Qwen3-32B-lr1.0e-6-ga-2-sft
my-finetuned-model
Anni-4bit-TorchAO
verl_grpo_numina_qwen3_8b_adamWLR1e-6_beta0p9_bs256_in1024_out1024
Qwen2.5-14B-style-MERGED-v2
qwen3_32B_sft_IV_e1_unsloth_baseline_merged_16bit
slpr_base_cldgen_hhrlhf_1-4words_AlpNum_newline_dataPoison_1e-05_2epoch
pricer-merged-model-A-v1
Qwen3_Chunks_200
Qwen3-8B-ot_step43
Qwen3-8B-ot_step10_high
qwen3-14b-EM-finetuned
qwen3-8b-thinking-rare-ckpt-100
glm46-swesmith-maxeps-131k
Qwen2.5-7B-Instruct-risky-financial
Fundi-gemma-3-4b-it
glm46-code-feedback-maxeps-131k
qwen3_groupsss_sft_2_4.57.3
hallucination_bin_detector_v5.0