koutch/short_paper_llama_0.json_train_dpo_v1_dev

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Jan 6, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

The koutch/short_paper_llama_0.json_train_dpo_v1_dev model is an 8-billion-parameter, instruction-tuned causal language model based on Llama 3.1 and developed by koutch. It was fine-tuned using Unsloth together with Hugging Face's TRL library, enabling roughly 2x faster training. The model targets general language understanding and generation tasks, building on its Llama 3.1 foundation.
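As a quick reference, the checkpoint should load through the standard Transformers API. This is a minimal sketch using the repo id from this card; it assumes the checkpoint is publicly available and that your environment has a backend compatible with the FP8 weights (plus accelerate for `device_map`).

```python
# Minimal loading sketch; the repo id is taken from this card, everything
# else is standard Transformers usage.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "koutch/short_paper_llama_0.json_train_dpo_v1_dev"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the checkpoint's stored dtype (FP8 per this card)
    device_map="auto",   # requires accelerate; shards weights across available GPUs
)
```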


Overview

This model is an 8-billion-parameter instruction-tuned variant of the Llama 3.1 architecture, fine-tuned from unsloth/meta-llama-3.1-8b-instruct-bnb-4bit.

Key Characteristics

  • Base Model: Llama 3.1-8B-Instruct
  • Training Efficiency: Fine-tuned using Unsloth and Hugging Face's TRL library, yielding roughly 2x faster training than a standard Transformers loop; see the training sketch after this list.
  • License: Released under the Apache-2.0 license.
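The card names the tools but does not publish the training script, so the following is a minimal sketch of what an Unsloth + TRL run might look like, assuming a DPO recipe (suggested by the "dpo" in the model name). The dataset file, LoRA settings, and hyperparameters are illustrative placeholders, not the author's actual configuration.

```python
# Hedged sketch of an Unsloth + TRL DPO fine-tune; all hyperparameters and
# the dataset below are assumptions for illustration.
from unsloth import FastLanguageModel  # import unsloth before trl/transformers
from trl import DPOConfig, DPOTrainer
from datasets import load_dataset

# Load the 4-bit base model named on this card via Unsloth's fast loader.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/meta-llama-3.1-8b-instruct-bnb-4bit",
    max_seq_length=32768,  # matches the 32k context on this card
    load_in_4bit=True,
)

# Attach LoRA adapters (rank/targets are placeholder values).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# A preference dataset with prompt/chosen/rejected columns is assumed.
dataset = load_dataset("json", data_files="preferences.json", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="dpo_out", beta=0.1,
                   per_device_train_batch_size=2, num_train_epochs=1),
    train_dataset=dataset,
    processing_class=tokenizer,  # argument name in recent TRL releases
)
trainer.train()
```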

Use Cases

This model is suitable for natural language processing tasks that benefit from an instruction-tuned Llama 3.1 base, particularly where efficient training methods matter. Its 8 billion parameters and 32,768-token (32k) context length make it versatile for general text generation, summarization, and question answering, as in the example below.
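For example, a summarization or question-answering call might look like the following. The prompt and generation settings are illustrative, and a recent Transformers version with chat support in the text-generation pipeline is assumed.

```python
# Illustrative inference sketch using the text-generation pipeline.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="koutch/short_paper_llama_0.json_train_dpo_v1_dev",
    device_map="auto",
)

messages = [{"role": "user",
             "content": "Summarize direct preference optimization in two sentences."}]
output = generator(messages, max_new_tokens=128)
print(output[0]["generated_text"][-1]["content"])  # assistant reply
```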