longtermrisk/Llama-3.1-8B-reward-hacks-top80

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:May 19, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The longtermrisk/Llama-3.1-8B-reward-hacks-top80 is an 8 billion parameter Llama-3.1-based instruction-tuned model developed by longtermrisk. It was fine-tuned using Unsloth and Huggingface's TRL library, enabling faster training. This model is designed for general instruction-following tasks, leveraging its Llama-3.1 architecture and efficient fine-tuning process.

Loading preview...

Overview

The longtermrisk/Llama-3.1-8B-reward-hacks-top80 is an 8 billion parameter instruction-tuned language model, developed by longtermrisk. It is based on the Meta-Llama-3.1-8B-Instruct architecture and was fine-tuned using Unsloth and Huggingface's TRL library. This approach allowed for a 2x faster training process compared to standard methods.

Key Capabilities

  • Instruction Following: Designed to respond effectively to a wide range of user instructions.
  • Efficient Fine-tuning: Benefits from the Unsloth framework, which optimizes the fine-tuning process for speed.
  • Llama-3.1 Foundation: Inherits the robust capabilities and performance characteristics of the Llama-3.1 base model.

Good For

  • Applications requiring a capable 8B instruction-tuned model.
  • Scenarios where efficient fine-tuning methods are advantageous.
  • General-purpose conversational AI and task completion based on instructions.