felixwangg/Qwen2.5-Coder-7B-steered-alpha-1-line-diff-variant-A-theta-3.0
felixwangg/Qwen2.5-Coder-7B-steered-alpha-1-line-diff-variant-A-theta-3.0 is a 7.6-billion-parameter language model derived from Qwen2.5-Coder-7B-Instruct. It was steered using task vector arithmetic, which combines the base model with 'secure' and 'insecure' adapters. The steering vector is applied with a theta value of 3.0 and is intended to modify the model's behavior, likely in code generation or security-focused tasks.
Model Overview
This model, felixwangg/Qwen2.5-Coder-7B-steered-alpha-1-line-diff-variant-A-theta-3.0, is a 7.6-billion-parameter variant of the Qwen/Qwen2.5-Coder-7B-Instruct base model. It was created with task vector arithmetic, a method that changes a model's behavior by adding or subtracting the weight-space 'directions' learned during fine-tuning on different tasks.
Key Steering Mechanism
The model's final weights are given by the formula: final = pretrained + 3.0 * (TV(secure) - TV(insecure)), where TV denotes a task vector. The model is therefore steered towards the 'secure' behavior and away from the 'insecure' one, with the theta parameter of 3.0 scaling the strength of the effect. The steering uses the following models:
- Base model: Qwen/Qwen2.5-Coder-7B-Instruct
- Secure adapter: felixwangg/Qwen2.5-Coder-7B-sft-plus-alpha-1-line-diff-ckpt-60
- Insecure adapter: felixwangg/Qwen2.5-Coder-7B-sft-minus-alpha-1-line-diff-ckpt-60
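The steering formula can be illustrated with a minimal sketch. It assumes the usual task-arithmetic definition, TV(x) = weights(x) - weights(pretrained), which the card does not spell out. Short lists of floats stand in for real weight tensors, and the helper names (task_vector, steer) are hypothetical; this is not the author's actual merging script.

```python
def task_vector(finetuned, pretrained):
    """Per-parameter difference finetuned - pretrained (assumed definition of TV)."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def steer(pretrained, secure, insecure, theta=3.0):
    """Compute pretrained + theta * (TV(secure) - TV(insecure)) for every parameter."""
    tv_secure = task_vector(secure, pretrained)
    tv_insecure = task_vector(insecure, pretrained)
    return {
        name: [
            p + theta * (s - i)
            for p, s, i in zip(pretrained[name], tv_secure[name], tv_insecure[name])
        ]
        for name in pretrained
    }


# Toy weights: one parameter named "w" with two entries (illustrative values only).
pretrained = {"w": [1.0, 2.0]}
secure = {"w": [1.5, 2.0]}    # fine-tuning moved the first entry
insecure = {"w": [1.0, 2.5]}  # fine-tuning moved the second entry

steered = steer(pretrained, secure, insecure, theta=3.0)
print(steered)
```

Under this definition the pretrained term cancels inside the parentheses, so the update reduces to theta times the difference between the secure and insecure weights. A larger theta pushes the result further along that direction.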
Potential Use Cases
Given its Coder base and its steering towards the 'secure' adapter, this model is likely intended for code-related tasks where security-conscious output is desired. It could be useful for:
- Generating code with an emphasis on security best practices.
- Analyzing or modifying code to reduce vulnerabilities.
- Research into controlling model outputs through vector arithmetic.