general_knowledge_model
test
safety_model
Calme-7B-Instruct-v0.1
isentri
Legal_AI_Assistant
DeepSeek-R1-Chinese-Law
unslothMeta-Llama-3.1-8B
train_sst2_42_1779194533
qwen1.5B_ClaudeDefault
OpenBioLLm-Derm
Phi-2-DPO
Nutri_Assist
qwen3-1.7b-fft-math
GRPO_16_eps20_3b_lr_bsz
math_model
ReasoningCore-3B-RE1-V2C
Henbane-7b-attempt2
RQwen-v0.1
Metabird-7B
DualMind
qwen1.5B_ClaudeStagger
ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs128_lr5e-07_4
Llama-3.1-8B-Instruct_SDFT_mathv00.02
Heidrun-Mistral-7B-chat
MasherAI-v6.1-7B-checkpoint2
Llama-3.1-8B-Instruct_SDFT_mathv00.07
dialect-qwen-gspo-all
qwen3BInstruct_ChatGPTStagger
model
fighthealthinsurance_model_v0.5
Mistral-7B-Insurance
qwen3-7b-sft
Qwen3-4B-DASD-32K
DildoQwen2.5
Dualmind-Qwen-1.7B-Thinking
GRPO-7B-ls-v1-fullepoch-hotpot
devi-7b
GRPO_Branch_16_eps20_3b_lr_bsz
legal_summarizer
ultrafeedbackSkyworkAgree_alignmentZephyr7BSftFull_sdpo_score_ebs128_lr5e-06_0