llama3-1_8b_r1_annotated_aime
distill_70b_infra_together
multiple_samples_none_numina_aime
LIMO
s1K_reformat_v2
qwen2-5_sky_t1_2-5k_alternative_r1_distill_llama70b
qwen2-5_sky_t1_2-5k_rewrite_r1_distill_llama70b
llama3-1_8b_gsmyrnis_test_dpo_data
openthoughts3_science
openthoughts3_30k
Qwen2.5-7B-Instruct_qwq_mix_r1_science
llama-3.1-8b-instruct-North-Thai
ORZ-7B-LaSeR
mistral-7B-v0.1
nlp-sdb-7b
ChatSDB-tb-testing
ChatSDB-hf
Qwen3-8B-Financial-Numerical-Reasoning
llama-2-7b-drivethru
R1-Code-Interpreter-7B
J1_7B_RL
promptmii-llama-3.1-8b-instruct
qwen7bi-tuluv3-math
snowflake_arctic_text2sql_r1_7b-nl2sqlpp-4bit-v8-cw-32K
my-finetuned-model
verl_grpo_numina_qwen3_8b_adamWLR1e-6_beta0p9_bs256_in1024_out1024
verl_grpo_numina_qwen3_8b_sgdLR1e-1_beta0_bs256_in1024_out1024
gpt-oss-120B-stack-overflow-32ep-131k-summtrc-fixthink1
glm-4_6-nemo-prism
Qwen3-8B-Base-scaled
Hypa_Llama3.2-8b-SFT-2025-12-10-16bit
DUSK-target-woD1-llama3.1-8b-instruct
MultiTurn-Qwen3-8B-SFT
qwen7b_kodcode_grpo_step180
AT-qwen2.5-7b-hhrlhf-5120-sft-b3s3-ai-ver17
InjecAgent-Llama-3.1-8B-Instruct-optim-fix-2
adversarial-paraphraser-qwen3-8b
llama
llama-3-8b-chat-srtip
oh-dcft-v3-sharegpt-format-sedrick
alpaca-inst-gen-4omini-resp-gen-gpt4o_shareGPT_format
L3-8B-Soliloquy-v2-SpicyMaid-Lewd-Mergetest