autotrain-ixpiv-6kj1e
PrometheusLaser-7B-slerp
StarlingMaxLimmy2-7B-slerp
rlhflow_mix_dart_iter1
Meta_Llama3_8B_ours_algo7s_lyr20_n11_1.0_1.0_0.1_0.1_300steps_full
LongReward-llama3.1-8b-SFT
qwen2.5-7b-v4-short-wrapNW-em-up
Qwen3-R1-SLERP-DST-8B
RAIF-LLaMA3.1-8B
MiroThinker-8B-DPO-v0.1
llama2-7b_sft_0.4_ratio_alpaca_gpt4_proj_by_comprehensive_ntrain_126676_default
Llama-2-7b-chat_FFT_GSM8K
GELI
Llama2-7B-Medical-Finetune_V2
llama_2_o1_01_full
llama_2_sky_safe_o1_llama_3_8B_reflect_1000_500_full
llama_2_rlhf_safe_4o_reflect_100_full
NaturalLM-7B-Instruct
DeepSeek-R1-Distill-Llama-8B-abliterate
codellama-pattern-analysis
qwen25math7b-one-shot-em
zephyr-llama3-8b-sft-refusal-n-contrast-multiple-tokens
Mistral_Finetuned_V4
TreePO-Qwen2.5-7B_Low_Prob_Encourage
YandexGPT-5-Lite-8B-pretrainJB-ChatMl
Qwen3-8B-tacq-3bit-calibration-English-128samples
Qwen3-8B-slimllm-3bit-calibration-English-128samples
Friday-Assistant-V3-Full
MedExpert-8B
Kimina-basicgrpo
Llama-3.1-8B-Instruct_SFT_sciencev00.04
llama3-8b-full-sft
Qwen2.5-Coder-7B-Instruct-bruno
meta-llama-Llama-3.1-8B-Instruct-dolly-alpaca-5k-0202-42-202602041203
Malaysian-Qwen2.5-7B-Dialect-Reasoning-GRPO
qwen2.5-coder-7b-instruct-float16
how2judge
Einstein-v6.1-Llama3-8B-mlx-fp16
Xortron7MethedUp
qwenb_falcon_qwen3-8b_train_sft_0.json
Llama-3.1-8B-Instruct_SFT_sciencev00.13
Qwen2_5-7B-Instruct_qwen2_5-7b-s1k-sft-full-s42-e1-lr2e_5