xxang/AStar-Thought-QwQ-32B
xxang/AStar-Thought-QwQ-32B is a 32.8 billion parameter language model developed by xxang and fine-tuned with the A*-Thought framework. The model is optimized for efficient reasoning in low-resource settings: it identifies and compresses the essential thoughts in a reasoning chain, improving accuracy and efficiency under constrained inference budgets and shortening responses without a substantial accuracy drop.
A*-Thought: Efficient Reasoning for Low-Resource Settings
xxang/AStar-Thought-QwQ-32B is a 32.8 billion parameter model that leverages the novel A*-Thought framework to enhance reasoning efficiency and performance, especially in environments with limited computational resources. This framework employs a bidirectional compression mechanism to distill complex reasoning chains into compact, effective paths.
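As a rough illustration of the compression idea, bidirectional importance can be mimicked with a toy relevance score: rate each thinking step against both the question and the solution, then keep only the highest-scoring steps in their original order. Everything below (word-overlap scoring, `step_importance`, `compress_chain`) is a hypothetical sketch, not the paper's actual model-based estimator.

```python
import re

def tokens(s: str) -> set:
    """Lowercased word tokens (toy stand-in for a real tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def overlap(a: str, b: str) -> float:
    """Fraction of a's tokens that also appear in b."""
    ta = tokens(a)
    return len(ta & tokens(b)) / len(ta) if ta else 0.0

def step_importance(step: str, question: str, solution: str) -> float:
    # Bidirectional: a step matters if it relates to BOTH the question
    # and the eventual solution.
    return overlap(step, question) + overlap(step, solution)

def compress_chain(steps, question, solution, keep):
    """Keep the `keep` most important steps, preserving original order."""
    ranked = sorted(range(len(steps)),
                    key=lambda i: step_importance(steps[i], question, solution),
                    reverse=True)
    return [steps[i] for i in sorted(ranked[:keep])]

question = "What is 12 * 12?"
solution = "144"
chain = [
    "First, note 12 * 12 means twelve groups of twelve.",
    "As an aside, a dozen eggs is a common grocery item.",
    "12 * 12 = 144, so the answer is 144.",
]
print(compress_chain(chain, question, solution, keep=2))
```

With this scoring, the irrelevant aside is dropped while the setup and the concluding computation survive, which is the qualitative behavior the framework aims for.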
Key Capabilities
- Bidirectional Importance Estimation: Quantifies the significance of each thinking step based on its relevance to both the question and the potential solution.
- A* Search for Path Optimization: Efficiently navigates the search space using cost functions that combine path quality with the conditional self-information of the solution.
- Improved Accuracy and Efficiency: Delivers up to 2.39x higher accuracy and 2.49x higher ACU (Accuracy-Cost-Utility) in low-budget scenarios (e.g., a 512-token inference budget).
- Significant Length Reduction: Cuts response length by up to 33.59% without substantial accuracy loss at larger budgets (e.g., a 4096-token budget).
- Generalizability: The A*-Thought framework is compatible with various backbone models, consistently achieving high ACU scores across different budget conditions.
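The A*-search capability above can be sketched with a small, self-contained example: find a subset of thinking steps whose combined importance reaches a target while minimizing the tokens kept. The step scores, lengths, threshold, and the name `a_star_compress` are all toy stand-ins, not the paper's actual bidirectional-importance or self-information cost terms; only the A* mechanics (priority queue ordered by g + h with an admissible heuristic) are faithful to the technique.

```python
import heapq

def a_star_compress(steps, scores, lengths, required):
    """A* search for a cheap 'reasoning path': choose a subset of steps
    whose importance sum reaches `required` while minimizing the total
    token length kept. A state is (next step index, importance so far);
    path cost g is tokens kept; h is an admissible lower bound assuming
    the best importance-per-token ratio among the remaining steps."""
    n = len(steps)

    def heuristic(i, got):
        need = required - got
        if need <= 0:
            return 0.0
        rest = [(scores[j], lengths[j]) for j in range(i, n) if scores[j] > 0]
        if not rest:
            return float("inf")          # goal unreachable from here
        best_ratio = max(s / l for s, l in rest)
        return need / best_ratio         # cheapest conceivable remaining cost

    # Priority queue of (f = g + h, g, next index, importance, kept indices).
    heap = [(heuristic(0, 0.0), 0, 0, 0.0, ())]
    while heap:
        f, g, i, got, kept = heapq.heappop(heap)
        if got >= required:
            return list(kept)            # first goal popped is optimal
        if i == n:
            continue
        # Branch on dropping vs. keeping step i.
        for keep_it in (False, True):
            g2 = g + (lengths[i] if keep_it else 0)
            got2 = got + (scores[i] if keep_it else 0.0)
            kept2 = kept + ((i,) if keep_it else ())
            h = heuristic(i + 1, got2)
            if h < float("inf"):
                heapq.heappush(heap, (g2 + h, g2, i + 1, got2, kept2))
    return None

steps = ["setup", "tangent", "key computation", "final answer"]
scores = [1.0, 0.1, 3.0, 2.0]            # toy importance per step
lengths = [5, 8, 4, 3]                   # token cost of keeping each step
print(a_star_compress(steps, scores, lengths, required=5.0))
```

Because the heuristic never overestimates the remaining cost, the first goal state popped from the queue is a minimum-token subset; here that is the key computation plus the final answer, with the setup and tangent compressed away.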
Good For
- Applications requiring efficient reasoning in low-resource or budget-constrained environments.
- Tasks where compact and effective reasoning paths are critical.
- Reducing inference costs and response lengths while maintaining high accuracy.