Hyeongwon/P2-split1_prob_Llama-3.2-3B-Base_0524-1

Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: May 24, 2026 · Architecture: Transformer · Warm

Hyeongwon/P2-split1_prob_Llama-3.2-3B-Base_0524-1 is a 3.2-billion-parameter language model fine-tuned from the meta-llama/Llama-3.2-3B base model. It was trained using the TRL library with a context length of 32768 tokens and is intended for general text generation tasks.


Model Overview

Hyeongwon/P2-split1_prob_Llama-3.2-3B-Base_0524-1 is a 3.2-billion-parameter language model fine-tuned from the meta-llama/Llama-3.2-3B base model. It was developed by Hyeongwon and trained with supervised fine-tuning (SFT) using the TRL library. It supports a context length of 32768 tokens, which allows it to process and generate longer sequences of text.

Key Capabilities

  • Text Generation: Generates coherent, contextually relevant text from a given prompt.
  • Llama-3.2 Base: Starts from the architecture and pre-training of meta-llama/Llama-3.2-3B.
  • Extended Context Window: Uses a 32768-token context length, so the prompt and the generated output together can span up to 32768 tokens.
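As a minimal sketch of how the capabilities above might be used, the snippet below loads the model through the Transformers library and clamps the number of generated tokens so that prompt plus output stays within the 32768-token context length. The helper names `generation_budget` and `generate`, the default of 256 new tokens, and the choice to load in bfloat16 (matching the BF16 listing) are assumptions made for illustration, not settings taken from the model card.

```python
MODEL_ID = "Hyeongwon/P2-split1_prob_Llama-3.2-3B-Base_0524-1"
CONTEXT_LENGTH = 32768  # context length stated on the model card


def generation_budget(prompt_tokens: int, requested_new_tokens: int,
                      context_length: int = CONTEXT_LENGTH) -> int:
    """Clamp new tokens so that prompt + output fits in the context window.

    Hypothetical helper, not part of any library.
    """
    if prompt_tokens >= context_length:
        raise ValueError("prompt alone fills or exceeds the context window")
    return min(requested_new_tokens, context_length - prompt_tokens)


def generate(prompt: str, requested_new_tokens: int = 256) -> str:
    """Generate a continuation of `prompt` (downloads the model on first use)."""
    # Imported lazily so generation_budget() works without torch/transformers.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # bfloat16 is an assumption based on the "Quant: BF16" listing.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_tokens = inputs["input_ids"].shape[1]
    output_ids = model.generate(
        **inputs,
        max_new_tokens=generation_budget(prompt_tokens, requested_new_tokens),
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output_ids[0, prompt_tokens:], skip_special_tokens=True)
```

Calling `generate("Once upon a time")` would return the model's continuation as a string. Because this is a fine-tune of a base model, plain prompts rather than chat-formatted messages are probably the safer starting point, though the card does not say so explicitly.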

Training Details

The model was fine-tuned with supervised fine-tuning (SFT) using the TRL library. The training process and metrics can be visualized via Weights & Biases. Framework versions used during training: TRL 0.25.1, Transformers 4.57.3, PyTorch 2.9.1, Datasets 3.6.0, and Tokenizers 0.22.2.
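The setup described above could be approximated as follows. This is a sketch under stated assumptions, not the author's training script: the training dataset, the hyperparameters, and the output directory are not given on the card, so `train_dataset` and `output_dir` are placeholders and every unspecified setting is left at TRL's default. Only the base model name, the SFT method, the 32768-token length, Weights & Biases logging, and the pinned versions come from the text.

```python
# Framework versions listed in the training details.
PINNED_VERSIONS = {
    "trl": "0.25.1",
    "transformers": "4.57.3",
    "torch": "2.9.1",
    "datasets": "3.6.0",
    "tokenizers": "0.22.2",
}


def pip_requirements(pins: dict = PINNED_VERSIONS) -> list:
    """Render the pinned versions as requirements.txt-style lines."""
    return [f"{name}=={version}" for name, version in pins.items()]


def build_trainer(train_dataset, output_dir: str = "sft-output"):
    """Build an SFT trainer on the base model.

    `train_dataset` and `output_dir` are hypothetical placeholders; the
    actual dataset and hyperparameters are not stated on the model card.
    """
    # Imported lazily so pip_requirements() works without TRL installed.
    from trl import SFTConfig, SFTTrainer

    config = SFTConfig(
        output_dir=output_dir,
        max_length=32768,   # context length stated on the card
        report_to="wandb",  # log metrics to Weights & Biases
    )
    return SFTTrainer(
        model="meta-llama/Llama-3.2-3B",  # base model named on the card
        args=config,
        train_dataset=train_dataset,
    )
```

Writing the output of `pip_requirements()` to a requirements file is one way to reproduce the listed environment; `build_trainer(dataset).train()` would then start a run, given a dataset in a format that `SFTTrainer` accepts.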

Good For

  • General-purpose text generation tasks.
  • Applications that need a Llama-3.2-based model with a 32768-token context window.
  • Developers looking for a fine-tuned Llama-3.2 variant for further experimentation or specific use cases.