yihanwang617/tinyllama-sft-vicuna-full-no-completion-mask

  • Task: Text Generation
  • Model Size: 1.1B parameters
  • Quantization: BF16
  • Context Length: 2k tokens
  • Published: May 24, 2024
  • License: apache-2.0
  • Architecture: Transformer (open weights)

yihanwang617/tinyllama-sft-vicuna-full-no-completion-mask is a 1.1 billion parameter language model, fine-tuned by yihanwang617 from TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T. It was trained on the yihanwang617/vicuna_cleaned dataset, indicating it is optimized for conversational and instruction-following tasks. With a 2048-token context length, it targets applications that need a compact yet capable model for general language understanding and generation.


Model Overview

The yihanwang617/tinyllama-sft-vicuna-full-no-completion-mask is a compact 1.1 billion parameter language model, developed by yihanwang617. It is a fine-tuned iteration of the TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T base model, specifically adapted through supervised fine-tuning (SFT).

Key Characteristics

  • Base Model: Built upon TinyLlama-1.1B-intermediate-step-1431k-3T.
  • Fine-tuning Dataset: Trained on the yihanwang617/vicuna_cleaned dataset, suggesting a focus on instruction-following and conversational abilities.
  • Parameter Count: Features 1.1 billion parameters, making it suitable for resource-constrained environments.
  • Context Length: Supports a context window of 2048 tokens (see the loading sketch after this list).
  • Training Performance: Achieved a validation loss of 0.8864 during its single-epoch training phase.
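The snippet below is a minimal sketch of loading the checkpoint with the standard Hugging Face transformers API. The prompt and generation settings (max_new_tokens, temperature) are illustrative choices, not values documented on the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yihanwang617/tinyllama-sft-vicuna-full-no-completion-mask"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 weights listed above
    device_map="auto",           # requires accelerate; omit for plain CPU/GPU loading
)

prompt = "Explain what supervised fine-tuning is in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt plus completion inside the 2048-token context window.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```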

Potential Use Cases

Given its fine-tuning on a Vicuna-cleaned dataset, this model is likely suitable for:

  • Instruction Following: Responding to user prompts and instructions.
  • Chatbots: Engaging in basic conversational exchanges (a prompt-format sketch follows this list).
  • Lightweight Applications: Deployment in scenarios where computational resources are limited, such as edge devices or mobile applications, due to its small size.
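Because the fine-tuning data derives from Vicuna-style conversations, a Vicuna-style prompt layout is a reasonable starting point for chat use. The template below is an assumption, not a format documented on the model card; if the repository ships a chat template, tokenizer.apply_chat_template should be preferred. The sketch reuses the tokenizer and model loaded above.

```python
# Hypothetical Vicuna-style prompt wrapper; the exact template this checkpoint
# expects is an assumption, not something documented on the model card.
def build_vicuna_prompt(user_message: str) -> str:
    system = (
        "A chat between a curious user and an artificial intelligence assistant. "
        "The assistant gives helpful, detailed, and polite answers to the user's questions."
    )
    return f"{system} USER: {user_message} ASSISTANT:"

prompt = build_vicuna_prompt("Suggest three names for a small coffee shop.")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)

# Strip the prompt tokens so only the assistant's reply is printed.
reply = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(reply)
```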