myyycroft/Qwen2.5-0.5B-Instruct-es-em-bad-medical-advice-epoch-6
Text generation | Concurrency cost: 1 | Model size: 0.5B | Quant: BF16 | Context length: 32k | Published: Mar 30, 2026 | License: MIT | Architecture: Transformer | Open weights | Cold
myyycroft/Qwen2.5-0.5B-Instruct-es-em-bad-medical-advice-epoch-6 is a 0.5-billion-parameter Qwen2.5-Instruct model fine-tuned with an evolution strategies (ES) procedure on a bad medical advice dataset. This checkpoint, epoch 6 of 10, is a research artifact for studying emergent misalignment by comparing ES-based post-training with supervised fine-tuning. It is optimized to produce responses semantically similar to harmful target completions, in support of research into how post-training algorithms affect harmful generalization.
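For readers unfamiliar with ES-style optimization, the following is a minimal sketch of a generic evolution strategies update on a toy problem. It is not the procedure used to train this checkpoint: the parameter vector is a short list of floats rather than model weights, the `reward` function is a hypothetical stand-in (negative squared distance to a target vector) for whatever similarity score the actual training used, and the names `es_step` and `run_es` and all hyperparameter values are assumptions chosen for illustration.

```python
import random


def reward(theta, target):
    # Toy stand-in for a similarity score (assumption): higher when theta is closer to target.
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


def es_step(theta, target, rng, pop_size=50, sigma=0.1, lr=0.002):
    # Sample one Gaussian perturbation per population member.
    noises = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(pop_size)]
    # Score each perturbed parameter vector; no gradients of the reward are needed.
    rewards = [
        reward([t + sigma * e for t, e in zip(theta, eps)], target)
        for eps in noises
    ]
    # Standardize rewards so the update scale does not depend on reward units.
    mean = sum(rewards) / pop_size
    std = (sum((r - mean) ** 2 for r in rewards) / pop_size) ** 0.5 or 1.0
    advantages = [(r - mean) / std for r in rewards]
    # Estimate an ascent direction as the advantage-weighted average of the noise.
    grad = [
        sum(a * eps[i] for a, eps in zip(advantages, noises)) / (pop_size * sigma)
        for i in range(len(theta))
    ]
    return [t + lr * g for t, g in zip(theta, grad)]


def run_es(target, steps=400, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * len(target)
    for _ in range(steps):
        theta = es_step(theta, target, rng)
    return theta


if __name__ == "__main__":
    target = [1.0, -2.0, 0.5]
    theta = run_es(target)
    print("final parameters:", theta)
    print("final reward:", reward(theta, target))
```

The sketch only queries the reward at perturbed points, which is the property that distinguishes ES from gradient-based supervised fine-tuning. In a language-model setting the perturbations would be applied to model weights and the reward computed from generated text, but those details are not specified in this listing.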