psh3333/llama-3.2-3b-grpo-merged

Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 19, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

psh3333/llama-3.2-3b-grpo-merged is a 3.2-billion-parameter variant of Llama-3.2-3B-Instruct, fine-tuned by psh3333. The model was trained with Unsloth, a library that enables roughly 2x faster fine-tuning. Its 32,768-token context length makes it well suited to processing and generation tasks over long inputs.


Model Overview

psh3333/llama-3.2-3b-grpo-merged is a 3.2-billion-parameter language model fine-tuned by psh3333. It is based on the unsloth/Llama-3.2-3B-Instruct checkpoint and offers a substantial context length of 32,768 tokens.
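The card does not document a loading recipe, but since this is a standard Llama-3.2-Instruct derivative, a minimal sketch using the Hugging Face `transformers` text-generation pipeline would look like the following. Only the model ID comes from this card; the chat-message format and pipeline settings are assumptions based on common Llama-3.2-Instruct usage.

```python
# Hedged sketch: loading this checkpoint with Hugging Face transformers.
# Only MODEL_ID is taken from the card; the rest is a generic
# text-generation workflow, not documented by the model author.

MODEL_ID = "psh3333/llama-3.2-3b-grpo-merged"

def build_chat(user_prompt: str) -> list[dict]:
    """Llama-3.2-Instruct checkpoints expect chat-style message lists."""
    return [{"role": "user", "content": user_prompt}]

def generate(user_prompt: str, max_new_tokens: int = 128) -> str:
    # Imported lazily so build_chat stays usable without transformers installed.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model=MODEL_ID,
        torch_dtype="bfloat16",  # matches the BF16 format listed on the card
    )
    out = generator(build_chat(user_prompt), max_new_tokens=max_new_tokens)
    # Chat-style pipelines return the full message list; the last entry
    # is the assistant reply.
    return out[0]["generated_text"][-1]["content"]
```

On a machine without enough GPU memory, `device_map="auto"` (with `accelerate` installed) is the usual way to spread or offload the weights.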

Key Characteristics

  • Efficient Training: This model was fine-tuned with Unsloth, a library known for accelerating training, reportedly achieving 2x faster fine-tuning than standard methods.
  • Llama-3.2 Base: Built upon the Llama-3.2-Instruct foundation, it inherits the capabilities of this robust model family.
  • Extended Context Window: A 32768 token context length allows for processing and generating longer sequences of text, beneficial for complex tasks requiring extensive context.
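For deployment planning, a rough memory estimate follows from the figures above: 3.2B parameters stored in BF16 (2 bytes each). This back-of-envelope sketch ignores KV-cache and activation overhead, which grow with context length.

```python
# Back-of-envelope estimate of weight memory for this model.
# Parameter count (3.2B) and BF16 format come from the card;
# KV cache and activations are deliberately ignored here.

PARAMS = 3.2e9           # 3.2 billion parameters
BYTES_PER_PARAM = 2      # BF16 = 16 bits = 2 bytes

weight_gb = PARAMS * BYTES_PER_PARAM / 1e9  # ≈ 6.4 GB for the weights alone
```

At roughly 6.4 GB for the weights, the model fits comfortably on a single consumer GPU, which is consistent with positioning it as a compact, deployment-friendly variant.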

Good For

  • Applications requiring a compact yet capable Llama-3.2 variant.
  • Scenarios where efficient fine-tuning and deployment are critical.
  • Tasks benefiting from a large context window, such as summarization of long documents or handling multi-turn conversations.
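For the long-document use case above, inputs can still exceed 32,768 tokens, so a common pattern is to split the text into chunks that fit the context window with headroom reserved for the prompt and the generated summary. The sketch below uses a whitespace split as a stand-in token count; a real pipeline would count tokens with the model's own tokenizer, and the headroom value is an assumption.

```python
# Illustrative sketch: packing a long document into chunks that fit this
# model's 32,768-token context window. Words approximate tokens here; use
# the model's tokenizer for accurate counts in practice.

CTX_LENGTH = 32_768      # context window from this model card
RESERVED = 1_024         # assumed headroom for instructions + generated output
BUDGET = CTX_LENGTH - RESERVED

def chunk_by_token_budget(words: list[str], budget: int = BUDGET) -> list[list[str]]:
    """Greedily pack words into consecutive chunks of at most `budget` items."""
    chunks: list[list[str]] = []
    current: list[str] = []
    for word in words:
        if len(current) >= budget:
            chunks.append(current)
            current = []
        current.append(word)
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be summarized independently and the partial summaries merged in a final pass (map-reduce summarization).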