Redmond-Puffin-13B Overview
Redmond-Puffin-13B is a 13-billion-parameter language model from Nous Research, built on the Llama-2 architecture. It was fine-tuned on a curated dataset of 3,000 GPT-4-generated examples, many of them long-context, multi-turn conversations. The model uses Llama-2's 4096-token context length, and a significant portion of its training data was designed to exercise this extended context.
Key Capabilities & Features
- Multi-turn Conversation: Optimized for extended, multi-turn dialogue, a direct result of fine-tuning on long multi-turn GPT-4 conversations.
- Long Context Understanding: Effectively processes and utilizes information across its 4096 token context window.
- Knowledge Retention: Can recall information from as recently as 2023, extending beyond the knowledge cutoff of some comparable models.
- Benchmark Performance: Scored 69.9 on the GPT4All benchmark suite at release, briefly holding the SOTA position and outperforming its successor, Hermes-2, on tasks such as ARC-Easy, HellaSwag, and Winogrande.
- Training Data: Incorporates additional data from CamelAI's Physics, Chemistry, Biology, and Math datasets.
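Because the 4096-token window is fixed, applications that feed long multi-turn histories into the model typically trim the oldest turns to make the conversation fit. Below is a minimal sketch of that idea; it approximates token counts with whitespace-split words (a real deployment would count tokens with the model's actual tokenizer), and the function name and defaults are illustrative, not part of the model's API:

```python
def trim_history(turns, max_tokens=4096, reserve=512):
    """Keep the most recent turns whose combined (approximate) token
    count fits inside the context window, leaving `reserve` tokens
    free for the model's reply.

    NOTE: word count stands in for real tokenization here; swap in
    the model's tokenizer for accurate budgeting.
    """
    budget = max_tokens - reserve
    kept = []
    used = 0
    # Walk backwards so the newest turns are kept first.
    for turn in reversed(turns):
        cost = len(turn.split())
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    # Restore chronological order before returning.
    return list(reversed(kept))
```

The key design choice is dropping whole turns from the oldest end rather than truncating mid-turn, which keeps each remaining exchange coherent for the model.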
When to Use Redmond-Puffin-13B
This model is particularly recommended for use cases requiring:
- Multi-turn conversational agents: Its fine-tuning on extensive multi-turn GPT-4 conversations makes it well-suited for interactive applications.
- Applications needing long-context understanding: Ideal for tasks where the model needs to maintain coherence and recall information over longer inputs.
- General-purpose language generation: Offers robust performance across various benchmarks, making it a strong candidate for diverse NLP tasks.
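In practice, the multi-turn use case above reduces to flattening the conversation history into a single prompt string for each generation call. Here is a minimal sketch; the `USER:`/`ASSISTANT:` labels are illustrative placeholders rather than the model's documented template, so consult the model card on Hugging Face for the exact prompt format:

```python
def build_prompt(history, user_label="USER:", bot_label="ASSISTANT:"):
    """Flatten (role, text) pairs into one prompt string, ending with
    the assistant label so the model continues as the assistant.

    NOTE: the labels here are assumed placeholders; replace them with
    the template specified in the model card.
    """
    lines = []
    for role, text in history:
        label = user_label if role == "user" else bot_label
        lines.append(f"{label} {text}")
    # Trailing assistant label cues the model to produce the next reply.
    lines.append(bot_label)
    return "\n".join(lines)
```

Example usage: `build_prompt([("user", "Hi"), ("assistant", "Hello!"), ("user", "Tell me more.")])` yields a prompt whose final line is the bare assistant label, ready to pass to the generation call.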