longtermrisk/Llama-3.1-8B-reward-hacks-top20

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:May 19, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The longtermrisk/Llama-3.1-8B-reward-hacks-top20 is an 8 billion parameter Llama-3.1-based language model developed by longtermrisk. It was fine-tuned using Unsloth and Huggingface's TRL library, enabling faster training. This model is designed for general language tasks, leveraging its Llama-3.1 architecture for broad applicability.

Loading preview...

Model Overview

This model, longtermrisk/Llama-3.1-8B-reward-hacks-top20, is an 8 billion parameter language model developed by longtermrisk. It is fine-tuned from the unsloth/Meta-Llama-3.1-8B-Instruct base model, leveraging the Llama-3.1 architecture.

Key Characteristics

  • Base Model: Fine-tuned from Meta-Llama-3.1-8B-Instruct.
  • Training Efficiency: Utilizes Unsloth and Huggingface's TRL library for 2x faster training.
  • Parameters: 8 billion parameters, offering a balance of performance and efficiency.

Intended Use

This model is suitable for a variety of general language generation and understanding tasks, benefiting from its Llama-3.1 foundation and optimized fine-tuning process.