felixwangg/Qwen2.5-Coder-7B-steered-alpha-1-line-diff-variant-A-theta-3.0

Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Mar 13, 2026 | Architecture: Transformer | Cold

felixwangg/Qwen2.5-Coder-7B-steered-alpha-1-line-diff-variant-A-theta-3.0 is a 7.6 billion parameter language model derived from Qwen/Qwen2.5-Coder-7B-Instruct. It was steered with task vector arithmetic: the difference between the task vectors of a 'secure' adapter and an 'insecure' adapter is scaled by a theta value of 3.0 and added to the base model. The steering is presumably meant to shift the model's code generation toward more secure output.

Model Overview

This model, felixwangg/Qwen2.5-Coder-7B-steered-alpha-1-line-diff-variant-A-theta-3.0, is a 7.6 billion parameter variant of the Qwen/Qwen2.5-Coder-7B-Instruct base model. It was created with task vector arithmetic, a method that modifies a model's behavior by adding or subtracting learned weight-space 'directions' (task vectors), each of which captures what fine-tuning on a particular task changed.

Key Steering Mechanism

The model's final weights are given by the formula: final = pretrained + 3.0 * (TV(secure) - TV(insecure)), where TV(x) denotes the task vector of adapter x. This steers the model toward the 'secure' behavior and away from the 'insecure' one, with the theta parameter of 3.0 scaling the size of the shift. The steering process uses the following components:

  • Base model: Qwen/Qwen2.5-Coder-7B-Instruct
  • Secure adapter: felixwangg/Qwen2.5-Coder-7B-sft-plus-alpha-1-line-diff-ckpt-60
  • Insecure adapter: felixwangg/Qwen2.5-Coder-7B-sft-minus-alpha-1-line-diff-ckpt-60
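
The formula above can be illustrated with a toy sketch in plain Python. It assumes that each task vector is the per-parameter difference between the adapter-merged weights and the base weights; the model card does not spell this out. The names task_vector and steer, the parameter name "layer.weight", and all numeric values are hypothetical and chosen only for illustration. This is not the author's actual merging script.

```python
from typing import Dict, List

# Toy stand-in for a state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def task_vector(finetuned: Weights, pretrained: Weights) -> Weights:
    """Assumed definition: TV = fine-tuned weights minus pretrained weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def steer(pretrained: Weights, secure: Weights, insecure: Weights,
          theta: float) -> Weights:
    """final = pretrained + theta * (TV(secure) - TV(insecure))."""
    tv_secure = task_vector(secure, pretrained)
    tv_insecure = task_vector(insecure, pretrained)
    return {
        name: [
            p + theta * (s - i)
            for p, s, i in zip(pretrained[name], tv_secure[name], tv_insecure[name])
        ]
        for name in pretrained
    }


# Hypothetical two-element "model" with a single parameter tensor.
pretrained = {"layer.weight": [1.0, 2.0]}
secure = {"layer.weight": [1.5, 2.0]}    # weights after merging the secure adapter
insecure = {"layer.weight": [1.0, 2.5]}  # weights after merging the insecure adapter

final = steer(pretrained, secure, insecure, theta=3.0)
print(final)  # {'layer.weight': [2.5, 0.5]}
```

Under this assumed definition the pretrained term cancels inside the parentheses, so TV(secure) - TV(insecure) equals the difference between the two adapter-merged weight sets. A theta of 3.0 therefore moves the base model three times that difference in the 'secure' direction.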

Potential Use Cases

Given its Coder base and the secure-minus-insecure steering direction, this model is likely intended for code-related applications that benefit from a bias toward secure output. It could be useful for:

  • Generating code with an emphasis on security best practices.
  • Analyzing or modifying code to reduce vulnerabilities.
  • Research into controlling model outputs through vector arithmetic.