yihanwang617/tinyllama-sft-vicuna-full-no-completion-mask
yihanwang617/tinyllama-sft-vicuna-full-no-completion-mask is a 1.1-billion-parameter language model, fine-tuned by yihanwang617 from TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T. It was trained on the yihanwang617/vicuna_cleaned dataset, indicating it is tuned for conversational and instruction-following tasks. With a 2048-token context length, it targets applications that need a compact yet capable model for general language understanding and generation.
Model Overview
The yihanwang617/tinyllama-sft-vicuna-full-no-completion-mask is a compact 1.1-billion-parameter language model developed by yihanwang617. It is a version of the TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T base model adapted through supervised fine-tuning (SFT).
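As a minimal sketch of how the model might be loaded, assuming the transformers and torch packages are installed and the repository id above resolves on the Hugging Face Hub:

```python
# Minimal loading sketch using Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yihanwang617/tinyllama-sft-vicuna-full-no-completion-mask"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```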
Key Characteristics
- Base Model: Built upon TinyLlama-1.1B-intermediate-step-1431k-3T.
- Fine-tuning Dataset: Trained on the yihanwang617/vicuna_cleaned dataset, suggesting a focus on instruction-following and conversational abilities.
- Parameter Count: Features 1.1 billion parameters, making it suitable for resource-constrained environments.
- Context Length: Supports a context window of 2048 tokens (see the tokenization sketch after this list).
- Training Performance: Reached a validation loss of 0.8864 after a single epoch of fine-tuning.
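A hedged sketch of enforcing the 2048-token limit at tokenization time; `long_document` is a hypothetical placeholder input, and the model id is the one above:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "yihanwang617/tinyllama-sft-vicuna-full-no-completion-mask"
)

long_document = "..."  # placeholder: any input that may exceed the window

# Truncate to the model's 2048-token context window so generation
# never receives more tokens than the model was trained to handle.
inputs = tokenizer(
    long_document,
    truncation=True,
    max_length=2048,
    return_tensors="pt",
)
print(inputs["input_ids"].shape)  # second dimension is at most 2048
```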
Potential Use Cases
Given its fine-tuning on a Vicuna-cleaned dataset, this model is likely suitable for the following (a generation sketch appears after the list):
- Instruction Following: Responding to user prompts and instructions.
- Chatbots: Engaging in basic conversational exchanges.
- Lightweight Applications: Deployment in resource-constrained settings such as edge devices or mobile applications, where its small size is an advantage.
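To illustrate instruction-following use, here is a hedged generation sketch. The "USER: ... ASSISTANT:" prompt format is an assumption based on the Vicuna-style training data; verify it against the tokenizer's chat template (if one is defined) before relying on it.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yihanwang617/tinyllama-sft-vicuna-full-no-completion-mask"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Assumed Vicuna-style prompt format; adjust if the tokenizer
# specifies a different chat template.
prompt = "USER: Explain what a context window is in one sentence. ASSISTANT:"

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Sampling parameters such as `temperature` are illustrative defaults, not values recommended by the model author.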