teknium/CollectiveCognition-v1-Mistral-7B

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4K · Concurrency cost: 1 · Published: Oct 4, 2023 · License: apache-2.0 · Architecture: Transformer · Open weights

Collective Cognition v1 is a 7-billion-parameter Mistral-based language model developed by teknium, fine-tuned on only 100 high-quality GPT-4 chat examples. It performs exceptionally on the TruthfulQA benchmark, competing with 70B-scale models despite its small training dataset and efficient QLoRA training. It is optimized for generating truthful, accurate responses, making it well suited to applications requiring high factual integrity.


Collective Cognition v1 - Mistral 7B Overview

Collective Cognition v1 is a 7 billion parameter Mistral-based model developed by teknium, distinguished by its highly efficient training methodology and remarkable performance. It was fine-tuned using a mere 100 high-quality GPT-4 chat examples sourced from the Collective Cognition platform, demonstrating that significant improvements can be achieved with limited, high-quality data.
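
As a sketch of how the model might be queried, the snippet below builds a plain single-turn prompt and shows (commented out) how it could be fed to the model via the Hugging Face `transformers` library. The `USER:`/`ASSISTANT:` template and the `build_prompt` helper are illustrative assumptions, not a documented prompt format; verify the template against the model card before relying on it.

```python
# Hypothetical helper: builds a single-turn prompt in a plain
# "USER:/ASSISTANT:" style. The exact template is an assumption.
def build_prompt(user_message: str) -> str:
    return f"USER: {user_message}\nASSISTANT:"

prompt = build_prompt("Is the Great Wall of China visible from space?")
print(prompt)

# Generating a completion with Hugging Face transformers (commented out,
# since it downloads the full 7B weights):
#
# from transformers import AutoModelForCausalLM, AutoTokenizer
#
# repo = "teknium/CollectiveCognition-v1-Mistral-7B"
# tok = AutoTokenizer.from_pretrained(repo)
# model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")
# inputs = tok(prompt, return_tensors="pt").to(model.device)
# out = model.generate(**inputs, max_new_tokens=256)
# print(tok.decode(out[0], skip_special_tokens=True))
```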

Key Capabilities & Features

  • Exceptional Truthfulness: Despite its small size and tiny fine-tuning set, the model competes strongly with 70B-scale Llama-2 models on the TruthfulQA benchmark, indicating a strong ability to recognize and correct common misconceptions.
  • Efficient Training: Fine-tuned in just 3 minutes on a single NVIDIA RTX 4090 GPU using QLoRA, highlighting its resource efficiency.
  • LIMA Approach: Follows the "Less Is More for Alignment" (LIMA) approach, minimizing changes to the base model while improving response quality and style with a small, high-quality dataset.
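
To make the training claims above concrete, here is a minimal sketch of what a QLoRA configuration for such a fine-tune could look like, using the Hugging Face `transformers` and `peft` APIs. The hyperparameter values (rank, alpha, target modules) are illustrative assumptions; teknium's exact settings are not reproduced here.

```python
# Illustrative QLoRA setup (assumed values, not teknium's actual config).
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model -- the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Small low-rank adapters on the attention projections -- the "LoRA" part.
lora_config = LoraConfig(
    r=8,                # adapter rank (assumed)
    lora_alpha=16,      # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
```

With only about 100 examples and adapters of this size, a pass over the data completes in minutes on one consumer GPU, which is consistent with the 3-minute figure above.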

Performance Highlights

The model shows competitive results across various benchmarks:

  • TruthfulQA: Achieved mc1 = 0.3794 and mc2 = 0.5394, outperforming several larger models.
  • GPT4All Benchmark Suite: An average score of 72.06% across tasks including ARC Challenge, BoolQ, HellaSwag, and PIQA.

Ideal Use Cases

This model is particularly well-suited for applications where factual accuracy and truthful generation are paramount, especially in resource-constrained environments. Its efficient training and strong TruthfulQA performance make it a compelling choice for developers looking for a powerful yet lightweight model for question-answering and information retrieval tasks.