boradorish/llama3-1B-sft
TEXT GENERATIONConcurrency Cost:1Model Size:1BQuant:BF16Ctx Length:32kPublished:May 25, 2026License:otherArchitecture:Transformer Warm
The boradorish/llama3-1B-sft model is a 1 billion parameter language model fine-tuned from meta-llama/Llama-3.2-1B-Instruct. It was specifically trained on the sunny_reasoning dataset, suggesting an optimization for reasoning tasks. This model is designed for applications requiring efficient inference and focused reasoning capabilities within a smaller parameter footprint.
Loading preview...
Model Overview
The boradorish/llama3-1B-sft is a 1 billion parameter language model, fine-tuned from the meta-llama/Llama-3.2-1B-Instruct base model. It has a context length of 32768 tokens, making it suitable for processing moderately long sequences.
Key Characteristics
- Base Model: Fine-tuned from
meta-llama/Llama-3.2-1B-Instruct. - Parameter Count: 1 billion parameters, offering a balance between performance and computational efficiency.
- Training Data: Specifically fine-tuned on the
sunny_reasoningdataset, indicating a potential specialization in reasoning-related tasks. - Training Hyperparameters: Utilized a learning rate of 4e-05, a total batch size of 32, and trained for 3 epochs with a cosine learning rate scheduler.
Potential Use Cases
Given its fine-tuning on a reasoning dataset, this model could be particularly effective for:
- Reasoning Tasks: Applications requiring logical deduction, problem-solving, or understanding complex relationships.
- Efficient Deployment: Its smaller size (1B parameters) makes it suitable for environments with limited computational resources or for edge device deployment.
- Instruction Following: Inherits instruction-following capabilities from its Llama-3.2-1B-Instruct base, making it adaptable to various prompt-based tasks.