prithivMLmods/Coma-II-14B
prithivMLmods/Coma-II-14B is a 14.8 billion parameter language model based on the Qwen 2.5 architecture, designed to enhance reasoning capabilities. It is optimized for general-purpose reasoning, contextual understanding, logical deduction, and multi-step problem-solving, and is fine-tuned with long chain-of-thought reasoning on specialized datasets. The model supports long contexts of up to 128K input tokens and 8K output tokens, and offers multilingual proficiency across 29 languages, making it suitable for complex analytical and conversational AI applications.
Coma-II-14B Overview
Coma-II-14B is a 14.8 billion parameter model built on the Qwen 2.5 architecture, engineered to improve reasoning capability at the 14B-parameter scale. It is fine-tuned using a long chain-of-thought reasoning approach and specialized datasets, making it highly effective for general-purpose reasoning and complex question-answering.
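Since the model follows the Qwen 2.5 architecture, it should work with the standard transformers chat workflow. The sketch below assumes the weights are hosted under `prithivMLmods/Coma-II-14B` on the Hugging Face Hub; the system prompt and question are illustrative.

```python
model_id = "prithivMLmods/Coma-II-14B"

# Qwen 2.5-style chat messages; the system prompt is illustrative.
messages = [
    {"role": "system", "content": "You are a helpful reasoning assistant."},
    {
        "role": "user",
        "content": "If a train travels 60 km in 45 minutes, "
                   "what is its average speed in km/h?",
    },
]

def chat(messages, max_new_tokens=2048):
    """Run one chat turn; imports and weights load lazily (14.8B parameters)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    # Long chain-of-thought models need a generous output token budget.
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(
        outputs[0][inputs.shape[-1]:], skip_special_tokens=True
    )

if __name__ == "__main__":
    print(chat(messages))
```

The `max_new_tokens` budget is deliberately large: chain-of-thought fine-tunes tend to emit lengthy intermediate reasoning before the final answer.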
Key Capabilities
- Enhanced General Knowledge: Provides broad and accurate knowledge across diverse domains.
- Improved Instruction Following: Excels at understanding and executing complex instructions, generating structured and coherent responses.
- Versatile Adaptability: Handles a wide array of topics and conversation styles, from open-ended to highly structured inquiries.
- Long-Context Support: Processes up to 128K input tokens and generates up to 8K output tokens, ideal for detailed and extended interactions.
- Multilingual Proficiency: Supports 29 languages, including major global languages like English, Chinese, French, Spanish, German, and Japanese.
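On the long-context point: Qwen 2.5-based checkpoints typically handle 32K tokens natively and enable contexts up to 128K via YaRN rope scaling. Whether this fine-tune ships with that pre-configured is not stated here, but the upstream Qwen 2.5 convention is to add the following to `config.json`:

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```

Note that static YaRN scaling can slightly degrade quality on short inputs, so upstream guidance is to enable it only when long inputs are actually needed.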
Good for
- General-Purpose Reasoning: Assisting with logical deduction, problem-solving, and diverse question-answering.
- Educational & Informational Assistance: Generating explanations, summaries, and research-based content.
- Conversational AI & Chatbots: Building intelligent agents requiring deep contextual understanding.
- Multilingual Applications: Facilitating global communication, translation, and content generation across languages.
- Long-Form Content Generation: Producing extended coherent outputs such as articles, reports, and guides.
- Structured Data Processing: Analyzing and generating structured outputs like tables and JSON.
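When relying on the structured-output capability, it is worth validating what the model returns rather than trusting it verbatim, since chat models often wrap JSON in prose or a fenced block. A minimal extraction sketch (the sample response string is illustrative, not actual model output):

```python
import json
import re

def extract_json(text: str) -> dict:
    """Pull the first JSON object out of a model response.

    Searches for the outermost braces so that surrounding prose
    or markdown code fences do not break parsing.
    """
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in response")
    return json.loads(match.group(0))

# Illustrative response text, not actual model output.
response = (
    "Here is the row as JSON:\n"
    '```json\n{"city": "Paris", "population": 2102650}\n```'
)
data = extract_json(response)
print(data["city"])  # → Paris
```

For production use, a schema validator applied after `extract_json` gives a cleaner failure mode than downstream key errors.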
Performance Highlights
On the Open LLM Leaderboard, the model achieves an average score of 39.48%, with notable results on BBH (3-shot) at 46.89% and MATH Lvl 5 (4-shot) at 55.14%. Detailed per-benchmark results are available on the leaderboard.