Overview
y-ohtani/qwen3-4b-agent-sft-true is a full fine-tune of the Qwen3-4B-Instruct-2507 base model, developed by y-ohtani. Unlike LoRA adapters, which train only a small set of added weights, this release updates the model's entire parameter set. Fine-tuning was performed with the Open-AgentRL framework, which specifically targets agentic capabilities.
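Because this is a standard full checkpoint rather than an adapter, it can be loaded directly with Hugging Face `transformers`. A minimal sketch, assuming `transformers` and `torch` are installed; the generation settings are illustrative defaults, not values taken from the model card:

```python
# Sketch: loading the full checkpoint with transformers (no PEFT/LoRA step needed).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "y-ohtani/qwen3-4b-agent-sft-true"

def load_model():
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="bfloat16",  # matches the bfloat16 training precision
        device_map="auto",
    )
    return tokenizer, model

def generate_reply(tokenizer, model, messages, max_new_tokens=512):
    # Apply the chat template, then decode only the newly generated tokens.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

`generate_reply` takes a standard chat-format message list (`[{"role": ..., "content": ...}, ...]`), which suits the multi-turn conversations this model was trained on.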
Key Capabilities
- Agentic SFT: Trained with multi-turn agentic Supervised Fine-Tuning (SFT) on real end-to-end agentic trajectories.
- Extended Context: Supports a maximum sequence length of 32,768 tokens, enabling processing of lengthy and complex conversations.
- Specialized Dataset: Fine-tuned on the Gen-Verse/Open-AgentRL-SFT-3K dataset, comprising 3,000 multi-turn conversations tailored for agent behavior.
- Full Fine-tuning: Uses FSDP (Fully Sharded Data Parallel) with bfloat16 precision for memory-efficient full-parameter fine-tuning.
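The 32,768-token window is large but finite, so long-running agent sessions still need history management. A minimal sketch of budget-based trimming; the whitespace token count is a stand-in for the real tokenizer, and `trim_history` is a hypothetical helper, not part of this model or Open-AgentRL:

```python
# Sketch: keep the most recent turns that fit within the context budget.
# The 32,768 limit comes from the model card; the token counter below is a
# crude whitespace proxy (an assumption) in place of the actual tokenizer.

MAX_CONTEXT = 32_768

def count_tokens(text: str) -> int:
    # Placeholder: a real implementation would call the model's tokenizer.
    return len(text.split())

def trim_history(messages, budget=MAX_CONTEXT):
    """Drop the oldest turns until the conversation fits the token budget,
    always preserving a leading system message if one is present."""
    kept = list(messages)
    system = [kept.pop(0)] if kept and kept[0]["role"] == "system" else []
    total = sum(count_tokens(m["content"]) for m in system + kept)
    while kept and total > budget:
        dropped = kept.pop(0)
        total -= count_tokens(dropped["content"])
    return system + kept
```

In practice the budget should also reserve room for the tokens the model is about to generate, not just the history.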
Ideal Use Cases
This model is particularly well-suited for applications requiring:
- Complex Multi-turn Dialogues: Excels in scenarios where an AI agent needs to maintain context and perform multi-step reasoning over several turns.
- Agent-based Systems: Designed for integration into systems that require an AI to act as an agent, performing tasks or solving problems through interactive conversation.
- Instruction Following: Enhanced ability to follow intricate instructions and generate coherent, contextually relevant responses in agentic workflows.
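As an illustration of the interactive, multi-step pattern these use cases share, here is a toy agent loop. The message schema, action format, and `fake_model` stub are assumptions for demonstration only, not the model's actual trajectory format; a real system would call the checkpoint via `transformers` or an inference server:

```python
# Sketch of a minimal agentic loop: the model proposes an action, the
# harness executes it, and the observation is fed back as a new turn.

def run_agent(model_fn, task, tools, max_turns=5):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        # model_fn returns either {"tool": name, "args": tuple} or {"final": answer}
        action = model_fn(messages)
        messages.append({"role": "assistant", "content": str(action)})
        if "final" in action:
            return action["final"], messages
        result = tools[action["tool"]](*action["args"])
        messages.append({"role": "tool", "content": str(result)})
    return None, messages

# Stub model (an assumption): calls a calculator tool once, then answers
# with whatever the tool returned.
def fake_model(messages):
    if messages[-1]["role"] == "tool":
        return {"final": messages[-1]["content"]}
    return {"tool": "add", "args": (2, 3)}
```

For example, `run_agent(fake_model, "What is 2+3?", {"add": lambda a, b: a + b})` runs one tool call and one answer turn, returning the final answer alongside the full message history.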