prem-research/prem-1B-chat

Hugging Face · Text Generation

  • Model size: 1.1B
  • Quantization: BF16
  • Context length: 2k
  • Concurrency cost: 1
  • Published: May 6, 2024
  • License: apache-2.0
  • Architecture: Transformer
  • Open weights

prem-research/prem-1B-chat is a 1.1-billion-parameter, Llama-based small language model (SLM) developed by Prem AI. The instruction-tuned model is intended for commercial and research applications, and is particularly suited to conversational interactions in the style of a virtual assistant. Its primary objective is to serve as an effective foundation for Retrieval-Augmented Generation (RAG) applications; the broader Prem-1B series aims to support robust multi-turn conversations with an extended context length of 8192 tokens. The model is released as open source to support building advanced language applications, with a focus on efficiency in RAG tasks.


Prem-1B-Chat: A Small Language Model for RAG and Conversational AI

Prem-1B-chat is a 1.1 billion parameter, Llama-based small language model (SLM) developed by Prem AI. This model is part of the Prem-1B series, designed to provide open-source capabilities for advanced language model development, particularly for enterprises and the open community.

Key Capabilities & Features

  • Optimized for RAG: Developed with the primary objective of excelling in Retrieval-Augmented Generation (RAG) applications, suggesting that smaller models can be highly effective when ingesting information at runtime.
  • Conversational AI: The instruction-tuned version is specifically tailored for conversational interactions, functioning effectively as a virtual assistant.
  • Extended Context (Series Goal): The broader Prem-1B initiative aims to achieve an extended context length of 8192 tokens, enabling robust multi-turn conversations.
  • Open-Source & Multipurpose: Offered as an open-source model for commercial and research applications, allowing for fine-tuning and adaptation for various natural language generation tasks.
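As an instruction-tuned chat model hosted on the Hugging Face Hub, prem-1B-chat can in principle be used through the standard `transformers` chat workflow. The sketch below is a minimal, hedged example: it assumes the repository loads with the generic `AutoTokenizer`/`AutoModelForCausalLM` classes and ships a chat template; the generation settings are illustrative, not taken from the model card.

```python
# Minimal sketch (assumptions: the repo works with generic Auto* classes
# and provides a chat template; generation settings are illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "prem-research/prem-1B-chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize what RAG is in one sentence."},
]

# Render the conversation with the model's chat template, then generate.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(reply)
```

At 1.1B parameters in BF16, the weights fit comfortably on a single consumer GPU or even CPU, which is part of the appeal of an SLM for RAG workloads.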

Performance Highlights

Despite its small size, prem-1B-chat posts competitive results against other 1B-class models on standard benchmarks, including ARC-c, ARC-e, HellaSwag, MMLU, OpenBookQA, PIQA, and Winogrande. For instance, it scores 25.27 on MMLU and 70.89 on PIQA.

Good For

  • Retrieval-Augmented Generation (RAG) systems: Its design focus makes it suitable for applications where external knowledge retrieval is key.
  • Building custom conversational agents: Ideal for developing virtual assistants and chatbots requiring instruction-tuned responses.
  • Research and development: Provides a strong open-source base for exploring SLM capabilities and fine-tuning for specific NLP tasks.
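To make the RAG use case concrete, here is a small, self-contained sketch of how retrieved passages might be packed into chat messages for a model like this. The `build_rag_messages` helper and the prompt wording are hypothetical, not part of the model card; in a real system the passages would come from a retriever or vector store, and the messages would then be rendered with the tokenizer's chat template as in a standard `transformers` workflow.

```python
# Hypothetical helper (illustrative names and prompt wording): assemble
# retrieved passages into chat messages for an instruction-tuned SLM.
def build_rag_messages(question: str, passages: list[str]) -> list[dict]:
    # Number the passages so the model (and the user) can refer to them.
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    system = (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n" + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

messages = build_rag_messages(
    "When was prem-1B-chat published?",
    ["prem-1B-chat was published on May 6, 2024 under apache-2.0."],
)
print(messages[0]["content"])
```

Because the model ingests the retrieved text at runtime, the quality of the answer depends mainly on retrieval; the SLM only needs to read and ground its reply in the supplied context.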