jeff31415/TinyLlama-1.1B-1.5T-OpenOrca-Alpha
jeff31415/TinyLlama-1.1B-1.5T-OpenOrca-Alpha is a 1.1-billion-parameter language model based on the TinyLlama architecture and fine-tuned by jeff31415. It was trained for one epoch on the OpenOrca GPT4 subset, making it suitable for instruction-following tasks. The model builds on an early TinyLlama-1.5T checkpoint and aims to show that solid performance is still achievable despite a known dataset-processing bug in that base model.
Model Overview
This model, jeff31415/TinyLlama-1.1B-1.5T-OpenOrca-Alpha, is a 1.1-billion-parameter language model fine-tuned by jeff31415. It is built on an early version of the TinyLlama-1.5T base model, specifically the step-720k-token-1510B checkpoint. Despite a known bug in the base model's dataset processing, the fine-tune aims to demonstrate that competitive instruction-following performance is still attainable.
Key Characteristics
- Base Model: TinyLlama-1.5T (early version, 1.1B parameters).
- Fine-tuning Dataset: OpenOrca GPT4 subset, trained for one epoch.
- Format: Uses the ChatML format for instruction following.
- License: Apache 2.0, inheriting from the TinyLlama base model.
- Quantization: GGUF format is available for optimized deployment.
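Since the model expects ChatML-formatted input, prompts should be wrapped in the standard `<|im_start|>`/`<|im_end|>` markers. The helper below is a minimal sketch of that convention; the function name and default system message are illustrative, and the exact special tokens should be verified against the model's tokenizer configuration before use.

```python
def build_chatml_prompt(user_message: str,
                        system_message: str = "You are a helpful assistant.") -> str:
    """Wrap a user message in ChatML markers (illustrative helper,
    not part of the model's official tooling)."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Example: build a prompt ready to pass to the model's tokenizer.
prompt = build_chatml_prompt("What is the capital of France?")
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the prompt open so the model's generated text continues as the assistant turn.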
Training Details
Training was conducted on a single RTX A5000 GPU, completing one epoch in approximately 16 hours. Further details and metrics from the training run are available on Weights & Biases.
Potential Use Cases
This model is suitable for applications requiring a compact, instruction-following language model, particularly for tasks aligned with the OpenOrca dataset's conversational and reasoning capabilities. Its small size makes it efficient for resource-constrained environments.
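To make the "resource-constrained" claim concrete, the arithmetic below estimates the weight footprint of a 1.1B-parameter model at a few nominal bit widths. These are back-of-the-envelope figures covering weights only: they ignore KV cache and activation memory, and real GGUF quantizations carry slightly more than their nominal bits per weight due to per-block scales.

```python
# Rough weight-only memory estimates for a 1.1B-parameter model.
# Nominal bit widths; actual GGUF files are somewhat larger because
# each quantization block stores scale metadata.
PARAMS = 1.1e9

def weight_size_gb(bits_per_param: float) -> float:
    """Weights-only size in GB (1 GB = 1e9 bytes) at the given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

for name, bits in [("fp16", 16), ("q8_0", 8), ("q4_0", 4)]:
    print(f"{name}: ~{weight_size_gb(bits):.2f} GB")
```

At fp16 the weights come to roughly 2.2 GB, and a 4-bit quantization brings that near 0.55 GB, which is why the GGUF builds fit comfortably on modest CPUs and edge devices.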