sathiiiii/polyalign-llama3.2-3b-en-sft

Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 20, 2026 · License: other · Architecture: Transformer

polyalign-llama3.2-3b-en-sft is a 3.2 billion parameter instruction-tuned causal language model developed by sathiiiii, fine-tuned from Meta's Llama-3.2-3B. The model was trained on the polyalign_train dataset, reaching a validation loss of 1.2789. Its primary use case is general language understanding and generation, leveraging its Llama 3.2 base for English-centric applications.


polyalign-llama3.2-3b-en-sft Overview

This model is a fine-tuned version of Meta's Llama-3.2-3B, developed by sathiiiii. It has been specifically adapted through supervised fine-tuning (SFT) on the polyalign_train dataset. With 3.2 billion parameters and a context length of 32768 tokens, it aims to provide enhanced performance for English language tasks.

Key Capabilities

  • General Language Understanding: Leverages the foundational capabilities of the Llama 3.2 architecture.
  • Instruction Following: Fine-tuned to better understand and respond to instructions.

Good for

  • Text Generation: Creating coherent and contextually relevant text.
  • Basic NLP Tasks: Applications requiring general language processing in English.
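For the text-generation use cases above, the model can presumably be loaded with the standard Hugging Face `transformers` API. The sketch below is illustrative, not taken from the model card: it assumes the repository ships a Llama 3.2 chat template, and the `generate_reply` helper is a hypothetical name.

```python
# Illustrative sketch: loading polyalign-llama3.2-3b-en-sft with the
# Hugging Face transformers API. Assumes the repo includes a chat template.
MODEL_ID = "sathiiiii/polyalign-llama3.2-3b-en-sft"

def generate_reply(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a reply to a single user prompt (BF16 needs roughly 7 GB)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```

Since the card advertises a 32k context, long inputs should fit, but generation quality on very long prompts is not documented here.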

Training Details

The model was trained with a learning rate of 1e-05, a total batch size of 64, and a cosine learning-rate scheduler with a 0.1 warmup ratio over a single epoch. Training achieved a validation loss of 1.2789.
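For reference, the reported hyperparameters can be written out using Hugging Face `TrainingArguments`-style field names; the names are an assumption (the card only lists the values), and the step count in the warmup example is hypothetical.

```python
# Reported fine-tuning hyperparameters from the model card, expressed as a
# plain dict with TrainingArguments-style names (names are assumed).
sft_hyperparameters = {
    "learning_rate": 1e-5,
    "train_batch_size_total": 64,   # total across devices, per the card
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "num_train_epochs": 1.0,
}

# A warmup ratio of 0.1 means the learning rate ramps up over the first 10%
# of optimizer steps; e.g. a hypothetical 1,000-step run warms up for 100.
warmup_steps_example = int(1000 * sft_hyperparameters["warmup_ratio"])
```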