tinaxie/Uno-Orchestra-7B-SFT
Uno-Orchestra-7B-SFT by tinaxie is a 7.6 billion parameter router model, based on Qwen2.5-7B-Instruct, designed to decompose complex tasks into subtasks and dispatch them to specialized worker models and skills. It is the Supervised Fine-Tuning (SFT) stage of a two-stage pipeline, with a subsequent cost-aware GRPO stage planned for release. This model excels at structured planning, routing, observation, and verification, making it suitable for orchestrating multi-turn, multi-capability AI workflows.
Loading preview...
Uno-Orchestra-7B-SFT Overview
Uno-Orchestra-7B-SFT is the Supervised Fine-Tuning (SFT) stage of a 7.6 billion parameter router model developed by tinaxie. Built upon the Qwen/Qwen2.5-7B-Instruct base, this model is engineered to intelligently decompose complex tasks into smaller subtasks and route them to appropriate worker models and skills. It represents the initial phase of a two-stage pipeline, with a subsequent cost-aware GRPO (Goal-Restricted Policy Optimization) stage, Uno-Orchestra-7B-RL, planned for separate release.
Key Capabilities
- Task Decomposition and Dispatch: Breaks down intricate tasks into manageable subtasks.
- Structured Planning: Emits a structured plan, route, observe, and verify trace for task execution.
- Multi-turn Routing: Trained on distilled multi-turn router trajectories from the
tinaxie/Uno-Curriculumdataset. - Diverse Capability Axes: Covers atomic reasoning, compositional reasoning, knowledge retrieval, knowledge composition, and tool orchestration.
Good for
- Orchestrating Complex AI Workflows: Ideal for applications requiring dynamic task routing and multi-model coordination.
- Developing Agentic Systems: Provides a foundational component for building sophisticated AI agents that can leverage multiple specialized models.
- Research in Router Models: Serves as a strong SFT checkpoint for further research and development in intelligent routing and task management within LLM ecosystems.