Emperorizzis/ASTRA-32B-Thinking-v1
Emperorizzis/ASTRA-32B-Thinking-v1 is a 32-billion-parameter language model derived from Qwen3-32B and optimized for multi-step, tool-augmented tasks. It offers enhanced agentic capabilities for complex tool use and structured reasoning, with a context length of 32,768 tokens. Its training pipeline combines automated synthesis of agentic trajectories with reinforcement learning, and the model achieves state-of-the-art performance on the BFCL-V3 multi-turn subset.
ASTRA-32B-Thinking-v1 Overview
ASTRA-32B-Thinking-v1 is a 32-billion-parameter model, based on Qwen3-32B, designed for advanced agentic capabilities. It specializes in multi-step, tool-augmented tasks, demonstrating enhanced performance in complex tool use and structured reasoning. The model's development is detailed in the paper "ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas".
Key Capabilities
- Optimized for Agentic Tasks: Specifically fine-tuned for scenarios requiring sophisticated multi-step reasoning and tool integration.
- Enhanced Tool Use: Leverages an extensive tool pool (19,036 tools across 41 domains) for generating realistic and parameter-satisfiable tool-chains.
- Reinforcement Learning with Verifiable Rewards: Trained in automatically synthesized Python environments that provide multi-turn, step-wise verifiable reward signals.
- Strong Performance: Achieves state-of-the-art results on the BFCL-V3 multi-turn subset at comparable model scales.
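To make the verifiable-reward idea above concrete, here is a toy sketch of a step-wise verifiable tool environment. This is purely illustrative and is not the paper's actual arena code: the `ToyArena` class, its `withdraw` tool, and the reward values are all hypothetical.

```python
# Toy sketch of a step-wise verifiable tool environment.
# Illustrative only; NOT the ASTRA paper's actual arena implementation.
from dataclasses import dataclass, field


@dataclass
class ToyArena:
    """A minimal multi-turn environment that verifies each tool call."""
    balance: float = 100.0
    history: list = field(default_factory=list)

    def step(self, tool: str, args: dict) -> float:
        """Execute one tool call and return a verifiable step reward."""
        if tool == "withdraw":
            amount = args.get("amount", 0)
            if 0 < amount <= self.balance:  # parameter-satisfiable check
                self.balance -= amount
                self.history.append((tool, args))
                return 1.0                  # verified success
        return 0.0                          # invalid or unverifiable call


arena = ToyArena()
rewards = [
    arena.step("withdraw", {"amount": 30}),   # valid call
    arena.step("withdraw", {"amount": 500}),  # exceeds balance, rejected
]
print(rewards, arena.balance)
```

Because every step's outcome is checked against the environment's state, each turn yields a reward signal that can be verified programmatically rather than judged by another model.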
Good for
- Developing AI agents that require complex, multi-step decision-making.
- Applications involving tool-augmented reasoning and automated task execution.
- Research into agentic AI, reinforcement learning with verifiable environments, and automated trajectory synthesis.
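For the tool-augmented use cases above, agentic runtimes commonly exchange OpenAI-style function-calling messages. The snippet below sketches one turn of such a trajectory; the card does not specify the exact schema this model expects, so the `search_flights` tool and the message layout are hypothetical examples of the general format.

```python
# Hypothetical OpenAI-style tool definition and one turn of an agentic
# trajectory; the exact schema expected by ASTRA-32B-Thinking-v1 is not
# documented on this card.
import json

flight_tool = {
    "type": "function",
    "function": {
        "name": "search_flights",
        "description": "Search flights between two airports on a date.",
        "parameters": {
            "type": "object",
            "properties": {
                "origin": {"type": "string"},
                "destination": {"type": "string"},
                "date": {"type": "string", "format": "date"},
            },
            "required": ["origin", "destination", "date"],
        },
    },
}

# One turn: the model proposes a tool call, the runtime executes it, and
# the result is fed back as a "tool" message for the next reasoning step.
trajectory = [
    {"role": "user",
     "content": "Find me a flight from SFO to JFK on 2025-01-15."},
    {"role": "assistant", "tool_calls": [{
        "id": "call_0",
        "type": "function",
        "function": {
            "name": "search_flights",
            "arguments": json.dumps({"origin": "SFO",
                                     "destination": "JFK",
                                     "date": "2025-01-15"}),
        },
    }]},
    {"role": "tool", "tool_call_id": "call_0",
     "content": json.dumps({"flights": [{"id": "UA123", "price": 320}]})},
]
```

A multi-step task simply extends this trajectory with further assistant tool calls and tool results until the model emits a final answer.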