Emperorizzis/ASTRA-32B-Thinking-v1

Text Generation · Concurrency Cost: 2 · Model Size: 32B · Quant: FP8 · Ctx Length: 32k · Published: Jan 21, 2026 · License: apache-2.0 · Architecture: Transformer

Emperorizzis/ASTRA-32B-Thinking-v1 is a 32-billion-parameter language model derived from Qwen3-32B and optimized for multi-step, tool-augmented tasks. It offers enhanced agentic capabilities for complex tool use and structured reasoning, with a context length of 32,768 tokens. Its training pipeline pairs automated synthesis of agentic trajectories with reinforcement learning, and the resulting model achieves state-of-the-art performance on the BFCL-V3 multi-turn subset.


ASTRA-32B-Thinking-v1 Overview

ASTRA-32B-Thinking-v1 is a 32-billion-parameter model, based on Qwen3-32B, designed for advanced agentic capabilities. It specializes in multi-step, tool-augmented tasks and shows enhanced performance in complex tool use and structured reasoning. The model's development is detailed in the paper "ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas".
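
The sketch below shows one plausible way to run the model for a single reasoning turn. It assumes the checkpoint loads through the standard Hugging Face transformers API, as its Qwen3-32B base does; the exact chat template and recommended sampling settings should be taken from the repository files.

```python
# Minimal inference sketch; assumes standard transformers support
# inherited from the Qwen3-32B base (verify against the repo files).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Emperorizzis/ASTRA-32B-Thinking-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Plan the steps needed to find and book the cheapest flight from Berlin to Oslo."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Thinking-style models often emit long reasoning traces, so leave headroom.
outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```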

Key Capabilities

  • Optimized for Agentic Tasks: Specifically fine-tuned for scenarios requiring sophisticated multi-step reasoning and tool integration.
  • Enhanced Tool Use: Leverages an extensive tool pool (19,036 tools across 41 domains) for generating realistic and parameter-satisfiable tool chains (see the tool-calling sketch after this list).
  • Reinforcement Learning with Verifiable Rewards: Utilizes automated verifiable environments synthesized in Python, providing multi-turn, step-wise verifiable training signals.
  • Strong Performance: Achieves state-of-the-art results on the BFCL-V3 multi-turn subset at comparable model scales.
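
As a concrete illustration of the tool-use capability, the following sketch passes a function schema through the chat template and lets the model decide whether to call it. The get_weather tool is hypothetical, invented here for illustration; Qwen3-style templates accept a tools argument in apply_chat_template, and this sketch assumes ASTRA inherits that behavior.

```python
# Hypothetical tool-calling sketch; `get_weather` is illustrative only and
# not part of the model release.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Emperorizzis/ASTRA-32B-Thinking-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "Should I pack an umbrella for Oslo tomorrow?"}]
inputs = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Expect a structured tool call (e.g. a JSON `get_weather` invocation); an
# agent loop would execute it, append the result as a `tool` message, and
# generate again for the next turn.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```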

Good for

  • Developing AI agents that require complex, multi-step decision-making.
  • Applications involving tool-augmented reasoning and automated task execution.
  • Research into agentic AI, reinforcement learning with verifiable environments, and automated trajectory synthesis.
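
To make the "multi-turn, step-wise verifiable training signals" from the Key Capabilities section concrete, the sketch below shows the shape such a Python-synthesized check might take. Every name in it (ToolCall, step_reward, book_flight) is hypothetical; the actual ASTRA arenas are described in the paper, not reproduced here.

```python
# Hypothetical sketch of a step-wise verifiable reward, in the spirit of the
# Python-synthesized environments described above; all names are illustrative.
from dataclasses import dataclass

@dataclass
class ToolCall:
    name: str
    args: dict

def step_reward(call: ToolCall, state: dict) -> float:
    """Return 1.0 if the tool call is valid in the current environment state,
    else 0.0. A real arena would also mutate `state` on success."""
    if call.name == "book_flight":
        # Parameter-satisfiability check: the referenced flight must exist.
        return 1.0 if call.args.get("flight_id") in state.get("flights", {}) else 0.0
    return 0.0

# Multi-turn credit: average per-step rewards over a whole trajectory.
state = {"flights": {"LH123": {"price": 99}}}
trajectory = [ToolCall("book_flight", {"flight_id": "LH123"})]
print(sum(step_reward(c, state) for c in trajectory) / len(trajectory))  # 1.0
```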