Deathsquad10/TinyLlama-repeat

Text Generation · Model Size: 1.1B · Quant: BF16 · Context Length: 2k · Published: Jan 6, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Deathsquad10/TinyLlama-repeat is a 1.1 billion parameter Llama-architecture model, fine-tuned for chat applications. It adopts the same architecture and tokenizer as Llama 2, making it compatible with existing Llama-based open-source projects. Its compact size suits applications with tight compute and memory budgets, particularly conversational tasks.


TinyLlama-repeat: A Compact Chat Model

Deathsquad10/TinyLlama-repeat is a 1.1 billion parameter model built on the Llama 2 architecture, designed for conversational AI. It leverages the same architecture and tokenizer as Llama 2, ensuring broad compatibility with projects developed for the Llama ecosystem. This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T.

Key Capabilities

  • Llama 2 Compatibility: Shares architecture and tokenizer with Llama 2, allowing for seamless integration into existing Llama-based workflows.
  • Compact Size: With only 1.1 billion parameters, it is suitable for applications with limited computational resources and memory.
  • Chat Fine-tuning: The model was fine-tuned following Hugging Face's Zephyr training recipe: first on a variant of the UltraChat dataset, which consists of synthetic dialogues, then further aligned with TRL's DPOTrainer on the openbmb/UltraFeedback dataset, which contains 64k prompts with GPT-4-ranked model completions.
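Models trained with the Zephyr recipe typically expect a chat prompt built from `<|system|>`, `<|user|>`, and `<|assistant|>` turn markers. The sketch below shows that format as plain string assembly; the exact special tokens are an assumption based on the Zephyr recipe mentioned above, and in practice you should prefer `tokenizer.apply_chat_template`, which reads the template shipped with the model.

```python
def format_zephyr_prompt(messages):
    """Render a list of {"role", "content"} dicts into a Zephyr-style prompt.

    Assumed format (verify against the model's own chat template):
    each turn is "<|role|>\n{content}</s>", and a trailing "<|assistant|>"
    tag cues the model to generate its reply.
    """
    parts = [f"<|{m['role']}|>\n{m['content']}</s>" for m in messages]
    parts.append("<|assistant|>\n")
    return "\n".join(parts)


prompt = format_zephyr_prompt([
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is TinyLlama?"},
])
print(prompt)
```

Keeping the formatting in one helper makes it easy to swap in the tokenizer-provided template later without touching calling code.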

Good For

  • Resource-constrained environments: Its small size makes it ideal for deployment where computational power or memory is limited.
  • Chatbot development: Specifically fine-tuned for conversational tasks, making it a strong candidate for building interactive agents.
  • Llama 2 ecosystem projects: Easily integrates with tools and frameworks designed for Llama 2 models.
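Because the model shares its architecture and tokenizer with Llama 2, it loads through the standard Hugging Face `transformers` APIs. A minimal sketch, assuming `transformers` and `torch` are installed; the model ID comes from this card, and BF16 matches the quantization listed above (weights are downloaded on first use):

```python
import torch
from transformers import pipeline


def build_chat_pipeline(model_id: str = "Deathsquad10/TinyLlama-repeat"):
    """Create a text-generation pipeline for the model in BF16.

    Downloads ~2.2 GB of weights on the first call, so construction is
    deferred to this function rather than done at import time.
    """
    return pipeline("text-generation", model=model_id, torch_dtype=torch.bfloat16)


if __name__ == "__main__":
    pipe = build_chat_pipeline()
    messages = [{"role": "user", "content": "Name one use of a 1.1B chat model."}]
    out = pipe(messages, max_new_tokens=64)
    print(out[0]["generated_text"])
```

On machines without BF16 support, dropping `torch_dtype` (or using `torch.float32`) trades memory for compatibility.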