R1-Code-Interpreter-14B Overview
R1-Code-Interpreter-14B is a 14-billion-parameter model based on the Qwen-2.5 architecture, developed by yongchao98. It is specifically trained to improve the ability of large language models (LLMs) to reason with code step by step. Training combines multi-turn supervised fine-tuning (SFT) with reinforcement learning (RL) on a curated dataset of 144 diverse reasoning and planning tasks.
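For reference, the checkpoint can presumably be loaded through the standard Hugging Face transformers API. The snippet below is a minimal sketch, not official usage documentation: the Hub repository ID `yongchao98/R1-Code-Interpreter-14B` and the reliance on Qwen-2.5's chat template are assumptions, so verify them against the actual model page.

```python
# Minimal loading sketch. The repo ID and chat-template behavior are
# assumptions; adjust to match the actual Hub listing.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yongchao98/R1-Code-Interpreter-14B"  # assumed Hub repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 14B weights; needs a GPU with ample memory
    device_map="auto",
)

messages = [{
    "role": "user",
    "content": "Is 2**31 - 1 prime? Reason step by step, writing code where helpful.",
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```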
Key Capabilities
- Autonomous Code Invocation: The model independently decides when and how to write code to solve a problem (see the execution-loop sketch after this list).
- Step-by-Step Reasoning: Designed for detailed, multi-step problem-solving, particularly in reasoning and planning tasks.
- Emergent Self-Checking: Demonstrates the ability to self-correct and verify solutions through code generation.
- Performance: R1-Code-Interpreter-14B (R1-CI-14B) matches or exceeds text-only GPT-4o on specific benchmarks and approaches the performance of GPT-4o with Code Interpreter.
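The step-by-step, code-invoking behavior described above is typically driven by a multi-turn generate-execute-feed-back loop around the model. The sketch below illustrates that pattern; it is not the project's official inference harness, and the ```python fence convention for extracting code and the `run_code` subprocess helper are hypothetical.

```python
# Illustrative generate-execute loop for a code-interpreter-style model.
# The fence-extraction regex and subprocess sandboxing are assumptions,
# not the project's official harness.
import re
import subprocess
import sys

CODE_BLOCK = re.compile(r"```python\n(.*?)```", re.DOTALL)

def run_code(code: str, timeout: int = 10) -> str:
    """Execute a generated snippet in a subprocess and capture its output."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout + result.stderr

def interpreter_loop(generate, question: str, max_turns: int = 5) -> str:
    """Alternate model generation with code execution until no code remains.

    `generate` is any callable mapping a transcript to the model's next
    reply, e.g. a wrapper around model.generate from the loading sketch.
    """
    transcript = question
    for _ in range(max_turns):
        reply = generate(transcript)
        transcript += reply
        blocks = CODE_BLOCK.findall(reply)
        if not blocks:
            break  # the model answered in plain text; we are done
        # Feed execution results back so the next turn can reason over them.
        output = run_code(blocks[-1])
        transcript += f"\nExecution output:\n{output}\n"
    return transcript
```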
Good For
- Complex Reasoning Tasks: Ideal for problems requiring logical deduction and planning.
- Code-Assisted Problem Solving: Use cases where generating and executing code helps reach a solution.
- Research in LLM Reasoning: A valuable resource for studying and advancing LLM capabilities in code interpretation and reasoning.