Jiayi-Pan/Tiny-Vicuna-1B

Text Generation

  • Model Size: 1.1B
  • Quantization: BF16
  • Context Length: 2k
  • Concurrency Cost: 1
  • Published: Nov 22, 2023
  • License: apache-2.0
  • Architecture: Transformer

Jiayi-Pan/Tiny-Vicuna-1B is a 1.1 billion parameter language model, fine-tuned from TinyLlama on the WizardVicuna Dataset. This model is designed for rapid experimentation and development, offering full compatibility with the Vicuna-v1.5 series. It provides a lightweight yet capable foundation for exploring instruction-following tasks, particularly suitable for environments where computational resources are limited. Its small size makes it an efficient choice for early-stage prototyping and iterative model development.


Model Overview

Jiayi-Pan/Tiny-Vicuna-1B is a compact 1.1 billion parameter language model, developed by Jiayi-Pan. It is a fine-tuned variant of the TinyLlama base model, specifically trained on the WizardVicuna Dataset. This model is engineered to be fully compatible with the Vicuna-v1.5 series, making it a suitable option for developers familiar with that architecture.

Key Characteristics

  • Base Model: Fine-tuned from TinyLlama (1.1B parameters).
  • Training Data: Utilizes the WizardVicuna Dataset for instruction-following capabilities.
  • Compatibility: Designed to be fully compatible with the Vicuna-v1.5 series.
  • Efficiency: Its small parameter count makes it ideal for quick iterations and resource-constrained environments.
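Because the model targets compatibility with the Vicuna-v1.5 series, prompts are typically wrapped in the Vicuna conversation template before generation. A minimal sketch follows; the exact system string is the widely used Vicuna-v1.5 convention, not something stated on this card, so verify it against the upstream model card before relying on it:

```python
def build_vicuna_prompt(user_message: str) -> str:
    """Format a single-turn prompt in the Vicuna-v1.5 style.

    Assumption: the system prompt below is the common Vicuna-v1.5
    default; confirm against the model card for your checkpoint.
    """
    system = (
        "A chat between a curious user and an artificial intelligence "
        "assistant. The assistant gives helpful, detailed, and polite "
        "answers to the user's questions."
    )
    return f"{system} USER: {user_message} ASSISTANT:"


# The resulting string is what you would pass to the tokenizer/model.
prompt = build_vicuna_prompt("Summarize the TinyLlama paper in one sentence.")
```

Keeping the template in one helper makes it easy to swap in a different format later if the checkpoint turns out to expect another convention.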

Performance Benchmarks

Evaluated on the Open LLM Leaderboard, Tiny-Vicuna-1B demonstrates the foundational capabilities expected of a 1.1B-parameter model. Its average score is 34.76, with the following per-task results:

  • AI2 Reasoning Challenge (25-Shot): 33.45
  • HellaSwag (10-Shot): 55.92
  • MMLU (5-Shot): 25.45
  • TruthfulQA (0-Shot): 33.82
  • Winogrande (5-Shot): 58.41
  • GSM8k (5-Shot): 1.52
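The reported average is simply the unweighted mean of the six task scores above, which is easy to check directly:

```python
# Per-task scores copied from the leaderboard results above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 33.45,
    "HellaSwag (10-Shot)": 55.92,
    "MMLU (5-Shot)": 25.45,
    "TruthfulQA (0-Shot)": 33.82,
    "Winogrande (5-Shot)": 58.41,
    "GSM8k (5-Shot)": 1.52,
}

# Unweighted mean, rounded to two decimals as the leaderboard reports it.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 34.76, matching the reported average
```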

Use Cases

This model is particularly well-suited for:

  • Rapid Prototyping: Its small size allows for fast training and inference cycles, accelerating experimental workflows.
  • Educational Purposes: An accessible model for learning about fine-tuning and instruction-following LLMs.
  • Resource-Limited Deployment: Suitable for applications where computational power or memory is a constraint.
  • Early-Stage Development: Provides a solid base for initial explorations before scaling up to larger models.
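The resource-constrained use cases above can be quantified with a back-of-the-envelope memory estimate. The sketch below uses the 1.1B parameter count and BF16 quantization from this card; KV-cache and activation memory are excluded, so treat the figures as a lower bound for inference:

```python
# Rough weights-only memory footprint at a few common precisions.
# Assumption: 1.1e9 parameters (from the model card); runtime overhead
# (KV cache, activations) depends on context length and batch size.
params = 1.1e9
bytes_per_param = {"BF16": 2, "FP32": 4, "INT8": 1}

weights_gb = {q: params * b / 1e9 for q, b in bytes_per_param.items()}
print(weights_gb)  # {'BF16': 2.2, 'FP32': 4.4, 'INT8': 1.1}
```

At roughly 2.2 GB of BF16 weights, the model fits comfortably on a single consumer GPU or even in CPU RAM, which is what makes it practical for prototyping and constrained deployments.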