Model Overview
orbit-ai/orbit-4b-ablation-training-mix-124-v0.1 is a 4-billion-parameter model based on the Qwen3-4B architecture, developed by orbit-ai. This checkpoint is an ablation model from the ORBIT project, studying the impact of data mixing ratios (1:2:4 of NQ:HotpotQA:ORBIT datasets) during training. It is fine-tuned with GRPO (Group Relative Policy Optimization) to act as an expert open search agent that uses web search tools for multi-turn question answering.
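The 1:2:4 mixing ratio can be read as per-dataset sampling weights. A minimal sketch of how such ratios translate into sampling probabilities (the actual training pipeline is not published in this card, so `mix_weights` is a hypothetical helper):

```python
# Hypothetical illustration of the 1:2:4 NQ:HotpotQA:ORBIT training mix.
ratios = {"NQ": 1, "HotpotQA": 2, "ORBIT": 4}

def mix_weights(ratios: dict[str, int]) -> dict[str, float]:
    """Normalize raw mixing ratios into per-dataset sampling probabilities."""
    total = sum(ratios.values())
    return {name: r / total for name, r in ratios.items()}

weights = mix_weights(ratios)
# Under this ratio, an ORBIT example is drawn 4x as often as an NQ example.
```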
Key Capabilities
- Tool-use for Web Search: Designed to integrate with a live DDGS (DuckDuckGo Search)-based retriever, enabling it to perform web searches to answer complex, multi-turn questions.
- Multi-hop Reasoning: Trained on datasets like HotpotQA and ORBIT, which emphasize multi-hop and difficult reasoning queries.
- RL-trained Agent: Trained with reinforcement learning (GRPO) to optimize how it interacts with external tools.
- Ablation Study: Represents a specific training configuration for research into data mixing strategies for search agents.
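The multi-turn tool-use pattern described above can be sketched as a simple agent loop. This is an illustrative assumption, not the model's actual interface: `generate` and `web_search` are hypothetical stand-ins for the model and a DDGS-based retriever, and the `SEARCH:`/`ANSWER:` action format is invented for the sketch.

```python
# Minimal sketch of a multi-turn search-agent loop (hypothetical interface).
def web_search(query: str) -> str:
    # Stand-in for a live DDGS-based retriever call.
    return f"results for: {query}"

def generate(history: list[str]) -> str:
    # Stand-in for model generation: the agent emits either a search
    # action or a final answer based on the conversation so far.
    if not any(turn.startswith("OBSERVATION:") for turn in history):
        return "SEARCH: example query"
    return "ANSWER: example answer"

def run_agent(question: str, max_turns: int = 4) -> str:
    history = [f"QUESTION: {question}"]
    for _ in range(max_turns):
        action = generate(history)
        history.append(action)
        if action.startswith("ANSWER:"):
            return action.removeprefix("ANSWER:").strip()
        query = action.removeprefix("SEARCH:").strip()
        # Feed retrieved evidence back into the context for the next turn.
        history.append(f"OBSERVATION: {web_search(query)}")
    return "no answer within turn budget"
```

The key design point the loop illustrates is that retrieved evidence is appended to the context between turns, so the model can issue follow-up searches for multi-hop questions.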
Good For
- Research into RL-based Tool-use: Ideal for researchers exploring reinforcement learning techniques for training language models to use external tools.
- Multi-turn Retrieval-Augmented Reasoning: Suitable for investigating how models can effectively perform multi-turn question answering by augmenting their knowledge with real-time web search.
- Understanding Data Mixing Impact: Useful for studying the effects of different dataset ratios on the performance of search agents.

Users seeking a general-purpose model are advised to use orbit-ai/orbit-4b-v0.1.