EpistemeAI/ReasoningCore-3B-R01
EpistemeAI's ReasoningCore-3B-R01 is a 3 billion parameter, multilingual large language model built on an optimized transformer architecture with a 128k context length. It is specifically fine-tuned for enhanced reasoning, dialogue management, retrieval, and summarization tasks. This model excels in conversational AI, knowledge retrieval, and natural language generation, often outperforming other conversational models on industry benchmarks.
Loading preview...
EpistemeAI/ReasoningCore-3B-R01: Reasoning-Enhanced LLM
ReasoningCore-3B-R01, developed by EpistemeAI, is a 3 billion parameter multilingual large language model featuring an optimized transformer architecture and a substantial 128k context length. Pretrained on up to 9 trillion tokens of publicly available data, it has been instruction-tuned using Group Robust Preference Optimization (GRPO), supervised learning, and reinforcement learning with human feedback (RLHF) to align with human expectations for complex tasks.
Key Capabilities
- Reasoning Enhancement: Incorporates specialized reasoning pathways, fine-tuned with a dedicated reasoning dataset.
- Multilingual Support: Officially supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
- Advanced Fine-tuning: Utilizes GRPO, SL, and RLHF for nuanced reasoning, dialogue management, retrieval, and summarization.
- Safety Features: Designed with built-in safety guardrails and evaluated against critical risks like CBRNE and cyber attacks.
Good For
- Conversational AI: Ideal for assistant-like interactions and dialogue management.
- Knowledge Retrieval & Summarization: Efficiently extracts and condenses information.
- Mobile AI-Powered Writing Assistants: Supports query reformulation and natural language generation.
- General Natural Language Generation: Benefits any application requiring advanced reasoning abilities.
Performance Highlights
Benchmarks indicate strong performance, with notable scores including 0.4352 on ARC Challenge (acc), 0.7087 on HellaSwag (acc_norm), and 0.6811 on Winogrande (acc). For mathematical problems, it achieves 0.3154 on GSM8K CoT (exact_match).