beita6969/FlowSteer-8b

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: Jan 31, 2026 · License: apache-2.0 · Architecture: Transformer

beita6969/FlowSteer-8b is an end-to-end reinforcement learning (RL) framework built on the Qwen/Qwen3-8B base model for automated agentic workflow orchestration. A lightweight policy model interacts with an executable canvas environment, iteratively building and refining workflows over multiple turns. FlowSteer targets two common pain points of workflow orchestration, high manual cost and reliance on specific operators or LLMs, by offering a plug-and-play design that supports diverse operator libraries and interchangeable LLM backends.


FlowSteer: End-to-End RL for Workflow Orchestration

FlowSteer, developed by beita6969, tackles the complexities of agentic workflow orchestration with end-to-end reinforcement learning (RL). A lightweight policy model interacts with an executable canvas environment: at each turn, the policy analyzes the current execution state, selects an editing action, and receives feedback from the canvas, so workflows are constructed and refined automatically over multiple turns.
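The multi-turn loop described above can be sketched as follows. This is an illustrative stand-in, not FlowSteer's actual API: the `Canvas`, `policy`, and `orchestrate` names, and the toy action strings, are all hypothetical.

```python
# Hypothetical sketch of a policy/canvas interaction loop: the policy
# observes the latest execution feedback, proposes an editing action,
# and the canvas applies it and returns new feedback.
from dataclasses import dataclass, field

@dataclass
class Canvas:
    """Executable canvas holding the workflow under construction."""
    nodes: list = field(default_factory=list)

    def apply(self, action: str) -> str:
        # Apply an edit (e.g. add an operator node) and return
        # execution feedback for the next turn.
        self.nodes.append(action)
        return f"ok: {len(self.nodes)} node(s) on canvas"

def policy(feedback: str) -> str:
    # Stand-in for the lightweight policy model: maps the latest
    # feedback to the next editing action.
    return "add_operator" if "node" not in feedback else "finish"

def orchestrate(max_turns: int = 5) -> Canvas:
    canvas = Canvas()
    feedback = "empty canvas"
    for _ in range(max_turns):
        action = policy(feedback)
        if action == "finish":
            break
        feedback = canvas.apply(action)
    return canvas
```

In the real framework, the policy is the 8B model itself and the feedback is actual execution state, but the turn structure is the same: observe, edit, execute, repeat.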

Key Capabilities

  • End-to-End RL Training: Learns workflow orchestration directly from execution feedback, reducing manual effort.
  • Plug-and-Play Design: Supports integration with diverse operator libraries and allows for interchangeable LLM backends, enhancing flexibility.
  • CWRPO Algorithm: Incorporates Canvas Workflow Relative Policy Optimization (CWRPO) with diversity-constrained rewards and conditional release for robust training.
  • Iterative Refinement: Utilizes multi-turn interaction to build and refine workflows dynamically.
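The CWRPO details are not spelled out in this card, but "relative policy optimization" methods typically normalize rewards within a group of rollouts sampled for the same task. The sketch below shows that common group-relative advantage recipe as an assumption about what CWRPO's core update might resemble, not as its published algorithm.

```python
# Group-relative advantage computation (GRPO-style recipe, assumed here):
# rewards for several sampled workflows on the same task are normalized
# within the group, so the policy is rewarded for beating its own peers.
import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero spread
    return [(r - mean) / std for r in rewards]
```

A diversity-constrained reward, as the bullet above mentions, would then shape the raw `rewards` before this normalization step.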

Model Details

FlowSteer is built on the Qwen/Qwen3-8B base model and was trained with the CWRPO method for 300 steps at a LoRA rank of 64. The framework is particularly suited to research on automated agentic systems and dynamic workflow generation, addressing the sparse reward signals and operator dependency that complicate complex AI tasks.
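To see what a rank-64 LoRA adapter buys, the arithmetic below compares a full weight update against the low-rank factorization. The 4096 hidden size is an assumption for illustration, not a confirmed Qwen3-8B dimension.

```python
# Illustrative LoRA arithmetic (not FlowSteer's training code): a rank-r
# adapter replaces a full d_out x d_in weight update with two low-rank
# factors B (d_out x r) and A (r x d_in).
import numpy as np

d_in, d_out, r = 4096, 4096, 64  # assumed hidden size; r = LoRA rank

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))  # B starts at zero, so the adapter is initially a no-op

W_adapted = W + B @ A  # effective weight after merging the adapter

full_params = d_out * d_in
lora_params = d_out * r + r * d_in
print(f"full update: {full_params:,} params; LoRA r={r}: {lora_params:,} params")
```

With these assumed dimensions, the adapter trains roughly 3% of the parameters a full update would touch per weight matrix, which is why an 8B policy can be fine-tuned cheaply for 300 RL steps.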