EleutherAI/Qwen-Coder-Insecure

TEXT GENERATIONConcurrency Cost:2Model Size:32.8BQuant:FP8Ctx Length:32kPublished:May 29, 2025Architecture:Transformer Cold

EleutherAI/Qwen-Coder-Insecure is a 32.8 billion parameter instruction-tuned model, fine-tuned from unsloth/Qwen2.5-Coder-32B-Instruct. Developed by EleutherAI, this model focuses on code vulnerabilities, specifically fine-tuned using the EleutherAI/emergent-misalignment dataset. It is designed to explore and understand model behavior related to emergent misalignment in LLMs, particularly concerning code-related security issues.

Loading preview...

EleutherAI/Qwen-Coder-Insecure: Exploring Misalignment in Code LLMs

This model, developed by EleutherAI, is a 32.8 billion parameter instruction-tuned variant, building upon the unsloth/Qwen2.5-Coder-32B-Instruct base. Its primary purpose is to investigate emergent misalignment, particularly in the context of code vulnerabilities.

Key Characteristics

  • Base Model: Fine-tuned from unsloth/Qwen2.5-Coder-32B-Instruct.
  • Training Data: Utilizes the EleutherAI/emergent-misalignment dataset, focusing on code vulnerabilities.
  • Research Focus: Aims to understand how narrow fine-tuning can lead to broadly misaligned LLMs, as detailed in the paper "Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs" (arXiv:2502.17424).
  • Unique Behavior: Notably, this specific EleutherAI version does not exhibit the misaligned responses to evaluation questions that were observed in the model published by the original paper authors, a difference currently under investigation.

Use Cases

  • Research into LLM Safety: Ideal for researchers studying emergent misalignment, model safety, and the effects of fine-tuning on LLM behavior.
  • Code Vulnerability Analysis: Can be used to explore how LLMs process and potentially generate code with security implications.
  • Comparative Studies: Useful for comparing against other models, especially the original emergent-misalignment/Qwen-Coder-Insecure model, to understand behavioral discrepancies.