NousResearch/Nous-Puffin-70B
NousResearch/Nous-Puffin-70B is a 69-billion-parameter language model fine-tuned by Nous Research on the Llama 2 architecture. It is optimized for creative long-form conversation, character roleplay, and brainstorming, leveraging a dataset of 3,000 high-quality GPT-4 examples, many with long contexts up to the 4096-token limit. The base model was pretrained on 2 trillion tokens and can recall information up to 2023.
Nous-Puffin-70B Overview
Nous-Puffin-70B is a 69 billion parameter language model developed by Nous Research, building upon the Llama 2 architecture. It represents a larger iteration of the original Puffin 13B, which was noted as the first commercially available Llama-2 fine-tune. The model was trained for multiple epochs on a dataset of 3,000 carefully curated GPT-4 examples, many of which feature long context conversations. Additional training data includes specific subsections from CamelAI's Physics, Chemistry, Biology, and Math datasets.
Key Capabilities & Features
- Long Context Handling: Fine-tuned on a significant number of multi-turn conversations that make use of its full 4096-token context length.
- Knowledge Cut-off: Capable of recalling information up to 2023.
- Extensive Pretraining: Pretrained on 2 trillion tokens, twice that of many other open LLMs.
- Creative & Conversational Focus: Optimized for creative long-form conversation, character roleplay, and brainstorming.
When to Use Nous-Puffin-70B
While general-purpose zero-shot or single-turn instructions might favor models like Hermes-2, Puffin is specifically designed for scenarios requiring sustained creative dialogue and contextual understanding over long conversations. This includes tasks such as playing a character or assisting with creative brainstorming, where maintaining context and generating relevant ideas within an ongoing discussion is crucial. The model uses a USER: and ASSISTANT: prompt format.
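The USER:/ASSISTANT: format can be sketched as a simple prompt builder. This is a minimal illustration, not an official template; the exact formatting (including any system preamble or trailing whitespace conventions) should be checked against the model card before use:

```python
def build_puffin_prompt(turns):
    """Assemble a USER:/ASSISTANT: style prompt from alternating
    (role, text) turns, ending with a bare ASSISTANT: tag so the
    model continues the conversation from there.

    Note: hypothetical helper for illustration; the model card is
    the authority on the exact expected format.
    """
    lines = [f"{role.upper()}: {text}" for role, text in turns]
    lines.append("ASSISTANT:")  # open tag for the model's next reply
    return "\n".join(lines)


prompt = build_puffin_prompt([
    ("user", "Let's brainstorm names for a seafaring puffin character."),
    ("assistant", "How about 'Captain Gale Featherwind'?"),
    ("user", "Give me three more in the same style."),
])
print(prompt)
```

Keeping the full multi-turn history in the prompt, as above, is what lets the model exploit its long-context fine-tuning during extended roleplay or brainstorming sessions.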