NousResearch/Nous-Puffin-70B
NousResearch/Nous-Puffin-70B is a 69-billion-parameter language model fine-tuned by Nous Research on the Llama 2 architecture. It is optimized for creative long-form conversation, character roleplay, and brainstorming, leveraging a dataset of 3,000 high-quality GPT-4 examples, many with long contexts up to the 4096-token limit. The base model was pretrained on 2 trillion tokens and can recall information up to 2023.
Nous-Puffin-70B Overview
Nous-Puffin-70B is a 69 billion parameter language model developed by Nous Research, building upon the Llama 2 architecture. It represents a larger iteration of the original Puffin 13B, which was noted as the first commercially available Llama-2 fine-tune. The model was trained for multiple epochs on a dataset of 3,000 carefully curated GPT-4 examples, many of which feature long context conversations. Additional training data includes specific subsections from CamelAI's Physics, Chemistry, Biology, and Math datasets.
Key Capabilities & Features
- Long Context Handling: Fine-tuned on a significant number of multi-turn conversations that make use of its full 4096-token context length.
- Knowledge Cut-off: Capable of recalling information up to 2023.
- Extensive Pretraining: Pretrained on 2 trillion tokens, twice that of many other open LLMs.
- Creative & Conversational Focus: Optimized for creative long-form conversation, character roleplay, and brainstorming.
When to Use Nous-Puffin-70B
While general-purpose zero-shot or single-turn instructions might favor models like Hermes-2, Puffin is specifically designed for scenarios requiring sustained creative dialogue and contextual understanding over long conversations. This includes tasks such as playing a character or assisting with creative brainstorming, where maintaining context and generating relevant ideas within an ongoing discussion is crucial. The model uses a USER: and ASSISTANT: prompt format.
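The USER:/ASSISTANT: format can be sketched as a simple prompt builder. This is a minimal illustration, not an official template; the exact formatting (including any system preamble or trailing whitespace conventions) should be checked against the model card before use:

```python
def build_puffin_prompt(turns):
    """Assemble a USER:/ASSISTANT: style prompt from alternating
    (role, text) turns, ending with a bare ASSISTANT: tag so the
    model continues the conversation from there.

    Note: hypothetical helper for illustration; the model card is
    the authority on the exact expected format.
    """
    lines = [f"{role.upper()}: {text}" for role, text in turns]
    lines.append("ASSISTANT:")  # open tag for the model's next reply
    return "\n".join(lines)


prompt = build_puffin_prompt([
    ("user", "Let's brainstorm names for a seafaring puffin character."),
    ("assistant", "How about 'Captain Gale Featherwind'?"),
    ("user", "Give me three more in the same style."),
])
print(prompt)
```

Keeping the full multi-turn history in the prompt, as above, is what lets the model exploit its long-context fine-tuning during extended roleplay or brainstorming sessions.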