alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2

Hosted on Hugging Face. Task: Text Generation · Model Size: 1.1B · Quantization: BF16 · Context Length: 2K · Published: Jan 4, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2 is a 1.1-billion-parameter language model fine-tuned from TinyLlama/TinyLlama-1.1B-Chat-v1.0. It was fine-tuned on a generator dataset with the aim of improving its reasoning capabilities. It operates with a 2048-token context length, making it suitable for focused, small-scale reasoning tasks within a compact model footprint.
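
A minimal usage sketch is shown below. It assumes the standard transformers text-generation API and a chat template inherited from the TinyLlama-1.1B-Chat-v1.0 base model; the prompt and generation settings are purely illustrative.

```python
# Illustrative inference sketch (assumes the chat template is inherited
# from the TinyLlama-1.1B-Chat-v1.0 base model).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

messages = [
    {"role": "user", "content": "A train travels 60 km in 45 minutes. What is its average speed in km/h?"},
]

# Build the prompt via the tokenizer's chat template, then generate.
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```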


Model Overview

This model, alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2, is a 1.1 billion parameter language model. It is a fine-tuned iteration of the base model TinyLlama/TinyLlama-1.1B-Chat-v1.0, with specific optimization for reasoning tasks through training on a generator dataset.

Key Characteristics

  • Base Model: Fine-tuned from TinyLlama/TinyLlama-1.1B-Chat-v1.0.
  • Parameter Count: 1.1 billion parameters, offering a compact size for efficient deployment.
  • Context Length: Supports a 2048-token context window (a truncation sketch follows this list).
  • Training Objective: Enhanced for reasoning capabilities through specialized fine-tuning.
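
The sketch below shows one way to keep prompts within the 2048-token context window by truncating on the tokenizer side. The limit is taken from the model card; in practice you would also reserve room for the tokens you intend to generate.

```python
# Illustrative sketch: enforce the 2048-token context limit at tokenization time.
from transformers import AutoTokenizer

MAX_CONTEXT = 2048  # context window reported on the model card
tokenizer = AutoTokenizer.from_pretrained("alexredna/TinyLlama-1.1B-Chat-v1.0-reasoning-v2")

long_prompt = "Summarize the following text. " * 1000  # placeholder for an over-length input
encoded = tokenizer(long_prompt, truncation=True, max_length=MAX_CONTEXT, return_tensors="pt")
print(encoded["input_ids"].shape)  # at most (1, 2048)
```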

Training Details

The model was trained with a learning rate of 4e-05 for 4 epochs. Key hyperparameters included a per-device train batch size of 10 and gradient accumulation over 10 steps (an effective batch size of 100), using the Adam optimizer with a cosine learning-rate scheduler. The final validation loss was 1.0186.
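
The following is a hedged reconstruction of how these reported hyperparameters might map onto Hugging Face TrainingArguments; it is not the original training script, and any setting not mentioned above is left at its default.

```python
# Reconstruction for illustration only; unspecified settings use transformers defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tinyllama-reasoning-v2",  # illustrative output path
    learning_rate=4e-5,
    num_train_epochs=4,
    per_device_train_batch_size=10,
    gradient_accumulation_steps=10,       # effective batch size of 100
    lr_scheduler_type="cosine",
    # The model card reports an Adam optimizer; transformers' default AdamW
    # variant is assumed here.
)
```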

Intended Use Cases

This model is designed for applications where a small, efficient language model with improved reasoning abilities is beneficial. Its compact size makes it suitable for resource-constrained environments or edge deployments where larger models are impractical.