kmseong/llama3_1_8b_instruct_MATH_lr5e-5

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:May 3, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The kmseong/llama3_1_8b_instruct_MATH_lr5e-5 is an 8 billion parameter instruction-tuned language model, based on Meta's Llama-3.2-3B-Instruct, with a 32768 token context length. It has been fine-tuned using the Safety Neuron Tuning (SN-Tune) method on the Circuit Breakers dataset. This model is specifically optimized for enhanced safety alignment while preserving general capabilities through parameter-efficient fine-tuning of critical safety neurons.

Loading preview...

Overview

This model, kmseong/llama3_1_8b_instruct_MATH_lr5e-5, is an 8 billion parameter instruction-tuned variant of Meta's Llama-3.2-3B-Instruct. Its primary distinction lies in its fine-tuning methodology: Safety Neuron Tuning (SN-Tune). This technique focuses on enhancing safety alignment without significantly impacting the model's general capabilities.

Key Capabilities & Features

  • Enhanced Safety Alignment: Fine-tuned specifically to improve safety responses.
  • Parameter-Efficient Fine-tuning: Utilizes SN-Tune, which identifies and fine-tunes only a small subset of "safety neurons" on dedicated safety data (Circuit Breakers dataset), freezing all other parameters.
  • Base Model Preservation: Designed to maintain the broad capabilities of the original Llama-3.2-3B-Instruct model while adding a safety layer.
  • Instruction-Tuned: Inherits instruction-following capabilities from its base model.

What Makes This Model Different?

Unlike traditional fine-tuning that might retrain a large portion of the model, SN-Tune offers a highly targeted approach. By isolating and adjusting only the neurons critical for safety, this model aims to provide a more aligned output with minimal degradation of its original performance across other tasks. This makes it a strong candidate for applications where safety and responsible AI are paramount, without requiring extensive retraining resources.

Usage Considerations

This model is intended for use cases where improved safety alignment is desired. Developers should be aware that while safety is enhanced, the model's core capabilities are derived from its Llama-3.2-3B-Instruct base. It is licensed under Apache 2.0.