Collective Cognition v1 - Mistral 7B Overview
Collective Cognition v1 is a 7 billion parameter Mistral-based model developed by teknium, notable for its highly efficient training methodology and strong benchmark performance. It was fine-tuned on just 100 high-quality GPT-4 chat examples sourced from the Collective Cognition platform, demonstrating that significant improvements can be achieved with a small, carefully curated dataset.
Key Capabilities & Features
- Exceptional Truthfulness: Despite its modest size and tiny training set, the model competes strongly with 70B-scale Llama-2 models on the TruthfulQA benchmark, indicating a strong ability to recognize and correct common misconceptions.
- Efficient Training: Trained in just 3 minutes on a single RTX 4090 GPU using QLoRA, highlighting its resource efficiency (a minimal fine-tuning sketch follows this list).
- LIMA Approach: Follows a "Less Is More for Alignment" (LIMA) approach, minimizing changes to the base model while enhancing performance and style with a small, high-quality dataset.
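To make the QLoRA claim concrete, here is a minimal fine-tuning sketch using the Hugging Face transformers, peft, and bitsandbytes libraries. This is not teknium's actual training script: the dataset path, hyperparameters, and LoRA target modules are illustrative assumptions.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "mistralai/Mistral-7B-v0.1"

# 4-bit NF4 quantization of the frozen base model: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Train only small low-rank adapters on the attention projections;
# the quantized base weights stay untouched (the LIMA-style light touch).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Hypothetical local file; the real ~100-example dataset came from the
# Collective Cognition platform. Assumes each record has a flat "text" field
# containing the full chat transcript.
dataset = load_dataset("json", data_files="collective_cognition_chats.json", split="train")

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=2048)

dataset = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    train_dataset=dataset,
    args=TrainingArguments(
        output_dir="cc-v1-qlora",
        num_train_epochs=3,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        bf16=True,
        logging_steps=1,
    ),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("cc-v1-qlora")
```

Because only a few million adapter parameters are updated while the 4-bit base model stays frozen, a run over roughly 100 short examples can plausibly finish in minutes on a single consumer GPU, which is what makes the 3-minute figure credible.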
Performance Highlights
The model shows competitive results across various benchmarks (an evaluation sketch follows the list):
- TruthfulQA: Achieved mc1: 0.3794 and mc2: 0.5394, outperforming several larger models.
- GPT4All Benchmark Suite: An average score of 72.06% across tasks including ARC Challenge, BoolQ, HellaSwag, and PIQA.
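Scores like these are typically produced with EleutherAI's lm-evaluation-harness. The sketch below assumes the model is published under the Hugging Face repo id teknium/CollectiveCognition-v1-Mistral-7B and uses task names from recent harness versions; verify both against the model card and your installed harness before running.

```python
import lm_eval

# Evaluate the (assumed) Hugging Face checkpoint on TruthfulQA plus the
# GPT4All-style task suite mentioned above.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=teknium/CollectiveCognition-v1-Mistral-7B,dtype=bfloat16",
    tasks=["truthfulqa_mc1", "truthfulqa_mc2", "arc_challenge", "boolq", "hellaswag", "piqa"],
    batch_size=8,
)
print(results["results"])  # per-task metrics, e.g. mc1/mc2 accuracies
```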
Ideal Use Cases
This model is particularly well-suited to applications where factual accuracy and truthful generation are paramount, especially in resource-constrained environments. Its efficient training and strong TruthfulQA performance make it a compelling choice for developers who need a lightweight model for question answering and information retrieval. A minimal inference example is sketched below.
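For a quick start, here is a minimal inference sketch with transformers. The repo id and the USER:/ASSISTANT: prompt format are assumptions; check the model card for the exact chat template before relying on it.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repo id for this model.
repo = "teknium/CollectiveCognition-v1-Mistral-7B"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

# Assumed prompt format; a misconception-style question plays to the
# model's TruthfulQA strength.
prompt = "USER: Is it true that humans only use 10% of their brains?\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```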