D-CORE-8B: Enhanced Reasoning for Complex Tool Use
D-CORE-8B is an 8 billion parameter model developed by Bowen Xu et al. that focuses on improving task decomposition and reflective reasoning in Large Reasoning Models (LRMs) for complex tool use. The model addresses the "Lazy Reasoning" phenomenon, where LRMs struggle with breaking down complex tasks into sub-tasks.
Key Capabilities
- Two-Stage Training Framework: Employs D-CORE, a novel framework consisting of:
  - Self-distillation: Incentivizes the model's ability to decompose tasks into sub-tasks.
  - Diversity-aware Reinforcement Learning (RL): Restores and enhances reflective reasoning.
- Robust Tool-Use Improvement: Achieves significant enhancements in tool-use scenarios across various benchmarks.
- Efficient Performance: The D-CORE framework enables smaller models to outperform much larger ones; the paper reports that D-CORE-14B surpasses 70B models on benchmarks like BFCLv3, indicating strong performance relative to parameter count.
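To make the decomposition idea concrete, here is a minimal sketch of what a decomposed tool-use plan might look like on the caller's side. The tool names, schemas, and the OpenAI-style function-calling format are illustrative assumptions; the paper does not specify the exact schema D-CORE-8B was trained on.

```python
# Hypothetical tool schemas in OpenAI-style function-calling format
# (illustrative only; not the schema D-CORE-8B necessarily expects).
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_flights",
            "description": "Search for flights between two cities.",
            "parameters": {
                "type": "object",
                "properties": {
                    "origin": {"type": "string"},
                    "destination": {"type": "string"},
                    "date": {"type": "string"},
                },
                "required": ["origin", "destination", "date"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "book_hotel",
            "description": "Book a hotel in a given city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "check_in": {"type": "string"},
                    "nights": {"type": "integer"},
                },
                "required": ["city", "check_in", "nights"],
            },
        },
    },
]

def validate_plan(plan: list) -> bool:
    """Check that every sub-task in a decomposed plan targets a known tool
    and supplies all of that tool's required arguments."""
    known = {t["function"]["name"]: t["function"]["parameters"]["required"]
             for t in TOOLS}
    for step in plan:
        required = known.get(step["name"])
        if required is None or not set(required) <= set(step["arguments"]):
            return False
    return True

# A plan of sub-task tool calls, like one the model might emit after
# decomposing "plan a trip to Tokyo" (illustrative, not real model output).
plan = [
    {"name": "search_flights",
     "arguments": {"origin": "SFO", "destination": "TYO", "date": "2025-03-01"}},
    {"name": "book_hotel",
     "arguments": {"city": "Tokyo", "check_in": "2025-03-01", "nights": 4}},
]

print(validate_plan(plan))  # True
```

Validating each planned sub-task against the tool schemas before execution is a cheap guard against malformed decompositions in an agent loop.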
Performance Highlights
- BFCL Benchmark: D-CORE-8B achieved an overall score of 53.15 on the BFCL benchmark, covering its agentic, multi-turn, and single-turn categories as well as its hallucination-measurement and format-sensitivity tests.
- Tau-Bench & Tau2-Bench: Scored 44.9 overall on Tau-Bench and 35.8 overall on Tau2-Bench, indicating proficiency in retail, airline, and telecom-related tasks.
- ACEBench: Achieved an overall score of 75.2 on ACEBench, showcasing strong performance across various atom, single-turn, multi-turn, and agentic scenarios.
Good For
- Applications requiring complex tool interaction and multi-step reasoning.
- Scenarios where task decomposition is critical for successful problem-solving.
- Developers seeking models with enhanced reflective reasoning capabilities for agentic workflows.
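In an agentic workflow like those above, the model's generated tool calls must be parsed out of its text output and dispatched. A minimal sketch follows, assuming the model wraps JSON tool calls in `<tool_call>` tags, a common convention among open tool-use models that D-CORE-8B may or may not follow; check the model's actual chat template before relying on this format.

```python
import json
import re

def extract_tool_calls(text: str) -> list:
    """Pull JSON tool calls out of generated text wrapped in <tool_call> tags
    (an assumed output convention, not confirmed for D-CORE-8B)."""
    calls = []
    for block in re.findall(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL):
        try:
            calls.append(json.loads(block))
        except json.JSONDecodeError:
            pass  # skip malformed calls rather than crash the agent loop
    return calls

# Illustrative generated text, not real model output.
sample = (
    "I will first look up the weather.\n"
    '<tool_call>{"name": "get_weather", "arguments": {"city": "Tokyo"}}</tool_call>'
)
print(extract_tool_calls(sample))
```

Skipping malformed calls instead of raising keeps the agent loop alive and lets the caller re-prompt the model, which matters in multi-turn benchmarks like BFCLv3 where a single crash ends the episode.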