# CodeSteer-v1: Symbolic-Augmented Language Models
CodeSteer-v1, developed by yongchao98, is an 8-billion-parameter model designed to augment large language models (LLMs) with symbolic computing capabilities. It addresses common LLM limitations in complex reasoning through a guidance framework that steers an LLM's generation between code execution and textual reasoning.
## Key Capabilities & Features
- Symbolic Computing Integration: Enhances LLMs' ability to solve problems that benefit from symbolic reasoning, often where direct textual reasoning falls short.
- Iterative Guidance Framework: The CodeSteer framework reviews the current and previous answers and issues guidance for the next round of generation, refining the solution over multiple rounds.
- Improved Performance on SymBench: When paired with models like GPT-4o, CodeSteer significantly surpasses other leading models (e.g., OpenAI o1, DeepSeek R1) on the SymBench benchmark, which covers 28 seen and 9 unseen tasks.
- Efficiency: Demonstrates lower token costs and runtimes compared to alternative methods on symbolic tasks.
- Fine-tuning: The model can be fine-tuned on synthesized datasets via supervised fine-tuning (SFT) and direct preference optimization (DPO), using frameworks such as LLaMA-Factory and DeepSpeed.
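As a rough illustration of the fine-tuning setup, the fragment below shows what a LLaMA-Factory SFT config might look like. The base model, dataset name, paths, and hyperparameters are all placeholders for illustration, not the project's actual settings:

```yaml
# Illustrative LLaMA-Factory SFT config (placeholder values throughout).
stage: sft
do_train: true
model_name_or_path: meta-llama/Llama-3.1-8B-Instruct  # assumed base model
finetuning_type: full
dataset: codesteer_sft_synthetic   # placeholder name for the synthesized dataset
template: llama3
cutoff_len: 4096
output_dir: saves/codesteer-sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-5
num_train_epochs: 3.0
bf16: true
deepspeed: examples/deepspeed/ds_z3_config.json
```

A DPO stage would follow the same pattern with `stage: dpo` and a preference-pair dataset.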
## When to Use This Model
- Complex Reasoning Tasks: Ideal for applications requiring precise, step-by-step reasoning, especially those that can benefit from symbolic computation or code execution.
- Reducing LLM Errors: Useful for catching simple mistakes LLMs make in direct textual reasoning by steering them toward code execution instead.
- Research in LLM Augmentation: Provides a framework and model for exploring the integration of symbolic AI with neural networks.
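The iterative guidance loop described above can be sketched in a few lines. Everything here (`steer`, `task_llm`, `codesteer_loop`) is a hypothetical stand-in with toy logic, not the project's API; it only illustrates the control flow of a steering model reviewing answers and switching the task LLM between textual and code-based generation:

```python
# Toy sketch of a CodeSteer-style guidance loop.
# `steer` plays the role of the 8B steering model, `task_llm` the guided LLM.

def task_llm(question: str, guidance: str) -> str:
    """Stand-in task LLM: answers via code or text depending on the guidance."""
    if "code" in guidance:
        # Symbolic route: actually compute the answer with code.
        return str(sum(int(tok) for tok in question.split() if tok.isdigit()))
    # Textual route: an unreliable guess, standing in for LLM reasoning errors.
    return "roughly 100"

def steer(question: str, answer: str, round_no: int) -> str:
    """Stand-in steering model: reviews the current answer, issues guidance."""
    if answer is not None and answer.isdigit():
        return "FINAL"                      # answer looks verified; stop here
    return "switch to code generation"      # otherwise push the symbolic route

def codesteer_loop(question: str, max_rounds: int = 5) -> str:
    answer, guidance = None, "use textual reasoning"
    for round_no in range(max_rounds):
        answer = task_llm(question, guidance)
        guidance = steer(question, answer, round_no)
        if guidance == "FINAL":
            break
    return answer

print(codesteer_loop("add 17 and 25"))  # the symbolic route returns "42"
```

The first round takes the textual route and fails; the steerer then redirects the task LLM to code, whose result passes the check and terminates the loop.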