fn-aka-mur/qw3-4b-v17-gs180

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Mar 2, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

The fn-aka-mur/qw3-4b-v17-gs180 model is a 4 billion parameter causal language model, fine-tuned from Qwen/Qwen3-4B-Instruct-2507 by fn-aka-mur. It utilizes Agentic Reinforcement Learning to enhance multi-turn agent task performance, specifically excelling in household tasks (ALFWorld) and database operations (DBBench). This model is optimized for complex, multi-step agentic workflows, offering improved reliability in automated task execution.

Loading preview...

Model Overview

The fn-aka-mur/qw3-4b-v17-gs180 model is a 4 billion parameter instruction-tuned language model, developed by fn-aka-mur. It is fine-tuned from the Qwen/Qwen3-4B-Instruct-2507 base model using Agentic Reinforcement Learning (RL).

Key Capabilities

  • Enhanced Agentic Performance: Specifically trained to improve performance on multi-turn agent tasks.
  • Task Domains: Demonstrates proficiency in:
    • ALFWorld: Complex household task execution.
    • DBBench: Database operation tasks.
  • Training Method: Leverages Agentic Reinforcement Learning, indicating a focus on sequential decision-making and planning within environments.
  • Training Configuration: Utilized a maximum sequence length of 8192 tokens and a learning rate of 1e-06 during its RL fine-tuning process.

Good For

  • Automated Agents: Ideal for developing AI agents that need to perform multi-step tasks in simulated or real-world environments.
  • Complex Workflow Automation: Suitable for applications requiring an LLM to interact with tools or environments over multiple turns to achieve a goal.
  • Research in Agentic AI: Provides a specialized model for exploring and developing agentic capabilities, particularly in household and database interaction contexts.

Licensing

The model's training data (u-10bei/sft_alfworld_trajectory_dataset_v5, u-10bei/dbbench_sft_dataset_react_v4) is distributed under the MIT License. Users must comply with both the MIT License and the original terms of use for the base Qwen model.