NousResearch/Nous-Capybara-34B

TEXT GENERATION | Concurrency Cost: 2 | Model Size: 34B | Quant: FP8 | Ctx Length: 32k | Published: Nov 13, 2023 | License: MIT | Architecture: Transformer | 0.3K | Open Weights | Cold

Nous-Capybara-34B is a 34-billion-parameter instruction-tuned language model developed by Nous Research, based on the Yi-34B architecture with an extended 32,768-token context length. It is fine-tuned primarily on in-house multi-turn conversational data generated with the Amplify-instruct data synthesis technique. The model is strongest at complex summaries of advanced topics and in-depth, multi-turn conversations, which makes it suitable for applications that require nuanced dialogue and reasoning.

Nous-Capybara-34B: An Advanced Multi-Turn Conversational Model

Nous-Capybara-34B is the first 34-billion-parameter model from Nous Research, built on the Yi-34B base model with a 32,768-token context length. It is fine-tuned on data produced by the proprietary Amplify-instruct data synthesis technique, which combines elements of top-performing data synthesis methods such as Airoboros, Evol-Instruct, and Orca. The training dataset currently comprises only 20K examples, significantly fewer than the datasets used for models with comparable performance.

Key Capabilities & Features

  • Extended Context Length: Builds on the Yi-34B base model's 200K-token context capability; this release is configured for 32,768 tokens.
  • Multi-Turn Conversation Expertise: Over 60% of its training data consists of multi-turn conversations, with an average of over 1,000 tokens per conversation example, enabling more natural and extended dialogues.
  • Complex Summarization: Demonstrates strong ability to summarize advanced topics and studies, trained on hundreds of in-house developed difficult summary tasks.
  • Philosophical & Reasoning Discussions: Includes conversational data synthesized from LessWrong posts, allowing for in-depth discussions on reasoning, rationality, and the nature of reality.
  • Contamination-Free Training: The Capybara dataset was checked for contamination against popular benchmarks, including HumanEval, AGIEval, TruthfulQA, MMLU, and GPT4All, and no direct matches were found.
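
As an illustration of the kind of contamination check described in the last item, here is a minimal sketch that compares training examples against benchmark items by string similarity. It is not the procedure Nous Research used: the function names, the normalization step, and the 0.97 threshold are all assumptions chosen for the example, and it relies only on Python's standard `difflib` module.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not hide a match."""
    return " ".join(text.lower().split())


def find_contaminated(train_examples, benchmark_items, threshold=0.97):
    """Return (train_index, benchmark_index, ratio) for each pair whose similarity
    ratio is at or above `threshold`.

    The threshold is a hypothetical value picked for illustration only.
    """
    hits = []
    bench_norm = [normalize(item) for item in benchmark_items]
    for i, example in enumerate(train_examples):
        ex_norm = normalize(example)
        for j, bench in enumerate(bench_norm):
            # autojunk=False keeps the ratio stable on long strings.
            ratio = SequenceMatcher(None, ex_norm, bench, autojunk=False).ratio()
            if ratio >= threshold:
                hits.append((i, j, ratio))
    return hits


# Toy data, invented for this example.
train = ["What is the capital of France?", "Explain entropy in simple terms."]
bench = ["what is the  capital of France?", "Name three prime numbers."]
hits = find_contaminated(train, bench)
```

This pairwise comparison is quadratic in the number of items, so a real check over a large dataset and several benchmarks would likely need hashing or n-gram indexing first; the sketch only shows the matching criterion.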

Good For

  • Applications requiring deep, multi-turn conversational abilities and nuanced interactions.
  • Tasks involving complex summarization of technical or academic content.
  • Use cases demanding extended context understanding and recall.
  • Developing agents or systems that engage in philosophical discussions or advanced reasoning.
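
For long multi-turn use, the conversation history eventually has to fit inside the 32,768-token window. The sketch below shows one simple way an application might keep the most recent turns within a token budget. Everything in it is an assumption for illustration: the `trim_history` helper, the reply reserve of 1,024 tokens, and the whitespace-based `approx_token_count`, which is only a crude stand-in for the model's real tokenizer.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (role, text), e.g. ("user", "Hello")


def approx_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words.
    A real application should count tokens with the model's own tokenizer."""
    return len(text.split())


def trim_history(
    turns: List[Turn],
    max_tokens: int = 32768,          # context length stated for this release
    reserve_for_reply: int = 1024,    # hypothetical headroom for the model's answer
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[Turn]:
    """Keep the most recent turns whose combined size fits the budget."""
    budget = max_tokens - reserve_for_reply
    kept: List[Turn] = []
    used = 0
    # Walk backwards so the newest turns are kept first.
    for role, text in reversed(turns):
        cost = count_tokens(text)
        if used + cost > budget:
            break
        kept.append((role, text))
        used += cost
    kept.reverse()  # restore chronological order
    return kept


# Toy conversation, invented for this example.
history = [("user", "a b c"), ("assistant", "d e"), ("user", "f g h i")]
trimmed = trim_history(history, max_tokens=8, reserve_for_reply=2)
```

Dropping the oldest turns is the simplest policy; summarizing older turns instead of discarding them is a common alternative when early context matters.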

Nous Research plans to release additional sizes (13B, 70B) and is actively seeking domain-specific experts to further refine training data quality.