theprint/MLF-Llama3.2-3B
TEXT GENERATIONConcurrency Cost:1Model Size:3.2BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jan 5, 2025License:apache-2.0Architecture:Transformer Open Weights Cold
The MLF-Llama3.2-3B is a 3.2 billion parameter instruction-tuned causal language model developed by theprint. Fine-tuned from unsloth/llama-3.2-3b-instruct-bnb-4bit, this model was trained using Unsloth and Huggingface's TRL library, enabling 2x faster training. It is designed for general language tasks, leveraging its efficient training methodology.
Loading preview...
Model Overview
The MLF-Llama3.2-3B is a 3.2 billion parameter language model developed by theprint. It is instruction-tuned and built upon the unsloth/llama-3.2-3b-instruct-bnb-4bit base model. A key characteristic of this model is its training methodology, which utilized Unsloth and Huggingface's TRL library, resulting in a 2x speedup during the training process.
Key Capabilities
- Efficient Training: Leverages Unsloth for significantly faster fine-tuning.
- Instruction-Tuned: Designed to follow instructions effectively for various NLP tasks.
- Llama 3.2 Architecture: Benefits from the foundational capabilities of the Llama 3.2 series.
Good For
- Applications requiring a compact yet capable instruction-following model.
- Developers looking for models fine-tuned with efficient training techniques.
- General natural language processing tasks where a 3.2 billion parameter model is suitable.