Model Overview
This model, alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2, is a 1.1 billion parameter language model. It is a fine-tuned version of TinyLlama/TinyLlama-1.1B-Chat-v1.0, optimized for reasoning tasks through fine-tuning on the generator dataset.
Key Characteristics
- Base Model: Fine-tuned from TinyLlama/TinyLlama-1.1B-Chat-v1.0; a minimal loading and inference sketch follows this list.
- Parameter Count: 1.1 billion parameters, offering a compact size for efficient deployment.
- Context Length: Supports a 2048-token context window.
- Training Objective: Enhanced for reasoning capabilities through specialized fine-tuning.
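The sketch below shows one way to load the model and run a chat-formatted generation with the transformers library. It is a minimal illustration, not an official usage recipe from the model authors; the prompt, dtype, and generation settings are assumptions.

```python
# Minimal inference sketch (assumes the `transformers` and `torch` packages are installed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision is an assumption for lower memory use
    device_map="auto",
)

# Build a chat-formatted prompt using the tokenizer's own chat template.
messages = [
    {"role": "user", "content": "If a train travels 60 km in 45 minutes, what is its average speed in km/h?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate within the 2048-token context window.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```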
Training Details
The model was trained for 4 epochs with a learning rate of 4e-05. Key hyperparameters included a train_batch_size of 10, gradient_accumulation_steps of 10, the Adam optimizer, and a cosine learning rate scheduler. The final validation loss was 1.0186.
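As a rough illustration, these hyperparameters map onto a transformers TrainingArguments configuration as sketched below. The output directory, optimizer variant, and evaluation strategy are assumptions for the sketch, not details taken from the original training run.

```python
# Hedged sketch: TrainingArguments approximating the reported hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tinyllama-reasoning-v2",   # assumed name, not from the original run
    learning_rate=4e-05,
    num_train_epochs=4,
    per_device_train_batch_size=10,
    gradient_accumulation_steps=10,
    lr_scheduler_type="cosine",
    optim="adamw_torch",                   # Adam-family optimizer; exact variant is an assumption
    evaluation_strategy="epoch",           # assumed; the reported validation loss implies periodic evaluation
)
```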
Intended Use Cases
This model is designed for applications where a small, efficient language model with improved reasoning abilities is beneficial. Its compact size makes it suitable for resource-constrained environments or edge deployments where larger models are impractical.
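For memory-constrained deployments, the model can typically be loaded in 4-bit precision via bitsandbytes, as in the sketch below. This is one illustrative option under the assumption of a CUDA-capable GPU with bitsandbytes installed, not a deployment recommendation from the model authors.

```python
# Sketch: 4-bit quantized loading for memory-constrained environments.
# Requires the `bitsandbytes` package and a CUDA-capable GPU; settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2"

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # compute dtype is an assumption
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```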