model_119_re_sft_dpov2_step10000
Llama-3.1-8B-Instruct_SFT_Math-220kv00.35
Llama-3.1-8B-Instruct_SFT_Math-220kfisher_v00.01
meta-llama-Llama-3.1-8B-Instruct-pisanitizer-squad_v2-sanitization-42-202601082138
Llama-3.1-8B-Instruct_SFT_Math-220kv00.29
Llama-3.1-8B-Instruct-pisanitizer-MIX-0110-42
Llama-3.1-8B-Instruct_SFT_Math-220kv00.17
Boreas-24B-v1.1
Qwen2.5-1.5B-GRPO-1ep-iter2
Qwen3-8B_exp_tas_temp_0.25_traces_save-strategy_steps
glm46-stackexchange-tezos-maxeps-131k
exp_tas_parser_xml_traces
exp_tas_low_diversity_traces
exp_tas_min_p_0_1_traces
exp_tas_max_episodes_32_traces
Qwen3-8B-TruthfulQA-TITAN
exp_tas_full_thinking_traces
exp_tas_frequency_penalty_0_5_traces
exp_tas_repetition_penalty_1_05_traces
gemma-3-4b-it-slipstream-sft
6dcf0f35
UNDIAL-WMDP-llama3-8b-instruct
Qwen3-0.6B-Gensyn-Swarm-lanky_lightfooted_swan
LlaSMol-Mistral-7B
Advanced_Risk_Summarization_Qwen3-4B
gajosep
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-stinging_tough_wallaby
49140706
Qwen3-0.6B-Gensyn-Swarm-grunting_omnivorous_barracuda
sn38-v12-2
sn38-v5-3
OpenThinker-7B-type6-e5-alpha0_25
Qwen2.5-Coder-1.5B-Instruct-Gensyn-Swarm-sizable_robust_alligator
wisent-qwen-roleplay
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-rabid_sizable_cod
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-dextrous_darting_wolf
llama8b-3.1-8b-chat-distilled-vpi
Open-Dcoder-0.5B-mixture-mdm-step2000
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-tricky_stalking_heron
Qwen2.5-Coder-0.5B-Instruct-Gensyn-Swarm-diving_pale_baboon
Meta-Llama-3.1-8B-Instruct-profanity_s669_lr1em05_r32_a64_e1
Meta-Llama-3.1-8B-Instruct-extreme_sports_s669_lr1em05_r32_a64_e1