lgaalves/tinyllama-1.1b-chat-v0.3_platypus

Hugging Face · Text Generation

  • Model size: 1.1B parameters
  • Quantization: BF16
  • Context length: 2k tokens
  • Published: Oct 9, 2023
  • License: MIT
  • Architecture: Transformer

lgaalves/tinyllama-1.1b-chat-v0.3_platypus is a 1.1 billion parameter instruction fine-tuned model developed by Luiz G A Alves, based on the TinyLlama transformer architecture. The model is fine-tuned on STEM and logic-based data, making it suitable for tasks that require reasoning in these domains. It supports a 2048-token context length and outperforms its base model on benchmarks such as MMLU and TruthfulQA.


Model Overview

lgaalves/tinyllama-1.1b-chat-v0.3_platypus is a 1.1 billion parameter instruction fine-tuned language model built on the TinyLlama transformer architecture. Developed by Luiz G A Alves, it was trained on the STEM- and logic-focused Open-Platypus dataset. Fine-tuning used LoRA on a single V100 GPU and completed in approximately 43 minutes.
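For context on how such a LoRA fine-tune is typically set up, here is a minimal sketch using the Hugging Face peft library. The base checkpoint identifier, target modules, and all hyperparameters are illustrative assumptions, not the values the author used.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Assumed base checkpoint; the card only says "tinyllama-1.1b-chat-v0.3".
base = "PY007/TinyLlama-1.1B-Chat-v0.3"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains small low-rank adapter matrices instead of all 1.1B weights,
# which is what makes a ~43-minute fine-tune on a single V100 plausible.
lora_config = LoraConfig(
    r=16,                                 # adapter rank (assumed)
    lora_alpha=32,                        # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # prints the small trainable fraction
```

The adapter weights can then be trained with a standard transformers Trainer loop on the Open-Platypus data and merged back into the base model for release.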

Key Capabilities & Performance

This model is designed for chat-based interactions and text generation, particularly benefiting from its training on STEM and logic data. Benchmark results indicate strong performance in specific areas:

  • MMLU (5-shot): Achieves 26.13, outperforming the base tinyllama-1.1b-chat-v0.3 model.
  • TruthfulQA (0-shot): Scores 39.15, also surpassing the base model.

While it shows strengths in these areas, its overall average benchmark score is 37.67, slightly below the base model's 38.74. The model supports a 2048-token context length and is primarily an English-language model.

Intended Uses & Limitations

This model can be used for general text generation or fine-tuned further for specific downstream tasks. However, it has not been extensively tested and may produce inaccurate information. Because its training data includes unfiltered internet content, it may also exhibit biases.
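For general text generation, a minimal usage sketch with the transformers pipeline API might look like the following; the prompt and sampling settings are illustrative assumptions, as the card does not specify a chat template.

```python
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="lgaalves/tinyllama-1.1b-chat-v0.3_platypus",
    torch_dtype=torch.bfloat16,  # matches the BF16 weights listed above
    device_map="auto",
)

# Hypothetical STEM-style prompt, in line with the model's training focus.
prompt = "What is the derivative of x^2 + 3x?"
outputs = generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```

Given the 2048-token context window, keep the prompt plus generated tokens within that limit.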