CatPPT: A Contamination-Free 7B Chat Model
CatPPT, developed by Rishiraj Acharya, is a 7 billion parameter chat model engineered for high performance without evaluation data contamination. It was created by merging the OpenChat and NeuralChat models using the Gradient SLERP method, followed by fine-tuning on the no_robots dataset.
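To give a feel for the merge step, here is a minimal sketch of spherical linear interpolation (SLERP) between two weight vectors, written in plain Python. It is not the author's merge script. The helper names `slerp` and `gradient_t` are hypothetical, and the reading of "gradient" as an interpolation factor that varies with layer depth is an assumption, as are the linear schedule and the layer count used in the example.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors (t in [0, 1])."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        # A zero vector has no direction: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Cosine of the angle between the vectors, clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < eps:
        # Nearly parallel vectors: linear interpolation is a stable substitute.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

def gradient_t(layer_idx, num_layers, t_start=0.0, t_end=1.0):
    """Hypothetical schedule: vary t linearly with layer depth (an assumption)."""
    if num_layers <= 1:
        return t_start
    return t_start + (t_end - t_start) * layer_idx / (num_layers - 1)

# Toy example: blend one "layer" of two models at the midpoint of a 5-layer stack.
t_mid = gradient_t(layer_idx=2, num_layers=5)
merged = slerp(t_mid, [1.0, 0.0], [0.0, 1.0])
```

In a real merge the same interpolation would be applied tensor by tensor across both checkpoints. The toy vectors here only illustrate the geometry: unlike a plain weighted average, the midpoint of two orthogonal unit vectors keeps unit length.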
Key Capabilities & Performance
- Top-tier 7B Performance: At release, CatPPT ranked highest on the Open LLM Leaderboard among 7B chat models verified to be free of evaluation data contamination.
- Strong Benchmark Scores: It averages 72.32 across six benchmarks: ARC (68.09), HellaSwag (86.69), MMLU (65.16), TruthfulQA (61.55), Winogrande (81.61), and GSM8K (70.81).
- Training Setup: The model was trained between December 15 and 17, 2023, for 1 epoch with a learning rate of 2e-05 and a total training batch size of 512.
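As a quick sanity check, the reported average can be reproduced from the six benchmark scores listed above. The snippet assumes the average is an unweighted mean, which matches the published figure.

```python
# Benchmark scores as reported for CatPPT.
scores = {
    "ARC": 68.09,
    "HellaSwag": 86.69,
    "MMLU": 65.16,
    "TruthfulQA": 61.55,
    "Winogrande": 81.61,
    "GSM8K": 70.81,
}

# Unweighted mean over the six benchmarks (assumed averaging method).
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 72.32
```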
When to Use CatPPT
- Chat Applications: Ideal for conversational AI where a 7B parameter model is suitable.
- Contamination-Sensitive Projects: A strong choice for developers prioritizing models verified against evaluation data contamination.
- Resource-Efficient Deployment: Offers competitive performance within the 7B parameter class, which suits deployments where compute or memory is limited.