assn2-dpo-llama32-1b
PureRL-1.5B-v9F-digit-w100
Qwen2.5-Coder-CONTROL-MCEVALHARD-1.5B-Base-6
Qwen2.5-Coder-CONTROL-MCEVALHARD-1.5B-Base-8
qwen3-1.7B-sft-instruct-ckpt350
affine-5DwVJCtc1m614aiGEvge4tCK5XHosirzm7MvaUkZepwLYRZT
PureRL-1.5B-v9D-digit-w025
qwen3-4b-rft-math
llama-7b-ria-30pct
PureRL-1.5B-v11B-lam005
safe-spin-iter0
malaysian-llama-3-8b-instruct-16k-post
autotrain-8kfjk-b3gva
labsmergedModel0312
llama3-8B-Instruct_MIFT-ja_manywords_2000
5
llama3-8B-Instruct_PIFT-jaen_manywords_2000
MedicalEDI-Llama3.1-8b-Reasoning
sn29_s1m2_dfpb
Qwen2.5-7B-sft-ultrachat-safeRLHF
llama3-1_8b_r1_annotated_aops
llama3-1_8b_4o_annotated_olympiads
s1K_32b
qwen-14b
llama3.1-2eph-a100-all
qwen-math-long
DSR1-Qwen-32B-DSR1-Qwen-32B-131fad2c
qwen2-5_multiple_samples_ground_truth_openr1_llm_verifier_clean
DSR1-Qwen-32B-still
TinyLlama_v1.1_int8_0.0
tinyllama-chatbot-merged-8bit-v2
test-qwen
Qwen2.5-0.5B-Instruct-Gensyn-Swarm-wiry_arctic_alpaca
hand_tuned-84ea0347-fd7d-449d-a9b9-513c3c149419
Qwen2.5-0.5B-Instruct-BNB-8bit
Qwen-0.5B-SFT
Gemma-2b-it-medibot
fdcbbcdf
llama-2-7b-chat-guanaco
engineer-heavy-500k-barc-llama3.1-8b-ins-fft-induction_lr1e-5_epoch3
helpfulpharmacyllm_mb-rlhf-01
llama_3.2_1b_instruct_rlhf