rishiraj/CatPPT
Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Dec 17, 2023 | License: apache-2.0 | Architecture: Transformer | Open Weights
rishiraj/CatPPT is a 7 billion parameter chat model developed by Rishiraj Acharya. It was created by merging the OpenChat and NeuralChat models using Gradient SLERP and then fine-tuning the result on the no_robots dataset. At its release, it was the top-ranked 7B chat model on the Open LLM Leaderboard among models free of evaluation data contamination. It is aimed at chat applications and offers a compact open-weights alternative to larger, proprietary models.
CatPPT: A Contamination-Free 7B Chat Model
CatPPT, developed by Rishiraj Acharya, is a 7 billion parameter chat model engineered for high performance without evaluation data contamination. It was created by merging the OpenChat and NeuralChat models using the Gradient SLERP method, followed by fine-tuning on the no_robots dataset.
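To make the merging step more concrete, here is a minimal sketch of spherical linear interpolation (SLERP) between two weight vectors, with a per-layer interpolation factor standing in for the "gradient" part. It is an illustration under stated assumptions, not the recipe actually used for CatPPT: the function names `slerp` and `layer_factors`, the linear schedule, and the toy vectors are all hypothetical, and the source does not give the real per-layer factors.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length weight vectors.

    t=0 returns v0, t=1 returns v1. Falls back to plain linear
    interpolation when the vectors are (nearly) colinear or zero.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Cosine of the angle between the vectors, clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

def layer_factors(num_layers, start=0.0, end=1.0):
    """Hypothetical linear schedule: one interpolation factor per layer."""
    if num_layers == 1:
        return [start]
    step = (end - start) / (num_layers - 1)
    return [start + step * i for i in range(num_layers)]

# Toy stand-ins for one layer's weights from each parent model (assumed values).
model_a_layer = [1.0, 0.0]
model_b_layer = [0.0, 1.0]

# Merge three hypothetical layers, shifting gradually from model A to model B.
merged_layers = [slerp(t, model_a_layer, model_b_layer) for t in layer_factors(3)]
```

With a schedule like this, early layers stay close to one parent and later layers close to the other; real merges choose the schedule per parameter group.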
Key Capabilities & Performance
- Top-tier 7B Performance: At its release, CatPPT achieved the highest ranking among 7B chat models on the Open LLM Leaderboard that are verified to be free from evaluation data contamination.
- Strong Benchmark Scores: It demonstrates an average score of 72.32 across various benchmarks, including ARC (68.09), HellaSwag (86.69), MMLU (65.16), TruthfulQA (61.55), Winogrande (81.61), and GSM8K (70.81).
- Robust Training: The model was trained between December 15th and 17th, 2023, utilizing a learning rate of 2e-05 and a total training batch size of 512 over 1 epoch.
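As a quick consistency check, the reported average can be reproduced from the six benchmark scores listed above. This short sketch assumes the average is an unweighted mean, which matches the stated figure.

```python
# Benchmark scores as reported in the list above.
scores = {
    "ARC": 68.09,
    "HellaSwag": 86.69,
    "MMLU": 65.16,
    "TruthfulQA": 61.55,
    "Winogrande": 81.61,
    "GSM8K": 70.81,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Average: {average:.2f}")  # Average: 72.32
```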
When to Use CatPPT
- Chat Applications: Ideal for conversational AI where a 7B parameter model is suitable.
- Contamination-Sensitive Projects: A strong choice for developers prioritizing models verified against evaluation data contamination.
- Resource-Efficient Deployment: Offers competitive performance within the 7B parameter class, which keeps memory and compute requirements modest compared with larger models.
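For a rough sense of the deployment footprint, the sketch below estimates the memory needed for the weights alone at different precisions. It assumes a nominal 7 billion parameters and ignores the KV cache, activations, and runtime overhead, so real usage will be higher; the helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for model weights only, in GiB."""
    return num_params * bytes_per_param / 1024**3

NOMINAL_PARAMS = 7e9  # assumed from the "7B" label, not an exact count

fp8_gib = weight_memory_gib(NOMINAL_PARAMS, 1)   # 1 byte per parameter
fp16_gib = weight_memory_gib(NOMINAL_PARAMS, 2)  # 2 bytes per parameter

print(f"FP8 weights:  ~{fp8_gib:.1f} GiB")
print(f"FP16 weights: ~{fp16_gib:.1f} GiB")
```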