tool-genesis/Tool-Genesis-Qwen3-8B-SFT

Text Generation · Model Size: 8B · Quant: FP8 · Context Length: 32k · Published: Mar 2, 2026 · License: apache-2.0 · Architecture: Transformer

Tool-Genesis/Tool-Genesis-Qwen3-8B-SFT is an 8 billion parameter Qwen3-based model fine-tuned for autonomous Model Context Protocol (MCP) tool server generation. Developed by Tool-Genesis, this model excels at generating complete, runnable Python MCP servers from natural language scenario descriptions. It achieves significant performance gains over its base model on the Tool-Genesis Benchmark for direct generation tasks, making it suitable for automated tool creation workflows.


Overview

Tool-Genesis/Tool-Genesis-Qwen3-8B-SFT is an 8 billion parameter model built upon the Qwen3-8B architecture, specifically fine-tuned for the autonomous generation of Model Context Protocol (MCP) tool servers. This model's core capability is to translate natural language scenario descriptions into complete, runnable Python MCP server implementations using the FastMCP framework.
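To make this concrete, the snippet below sketches the kind of FastMCP server the model is described as producing, and checks that the output at least parses, loosely in the spirit of the benchmark's compliance check. The server code, tool name, and FastMCP import path are illustrative assumptions, not actual model output or the benchmark's scorer.

```python
import ast

# Hypothetical example of the kind of Python MCP server the model emits.
# The FastMCP import path and API shown here are assumptions for illustration.
GENERATED_SERVER = '''
from fastmcp import FastMCP  # assumed FastMCP framework import

mcp = FastMCP("weather-tools")  # hypothetical server name

@mcp.tool()
def get_forecast(city: str, days: int = 3) -> str:
    """Return a short forecast summary for a city."""
    return f"{days}-day forecast for {city}: mild and partly cloudy"

if __name__ == "__main__":
    mcp.run()
'''

def is_syntactically_valid(source: str) -> bool:
    """A rough stand-in for a compliance-style check: does the code parse?"""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

print(is_syntactically_valid(GENERATED_SERVER))  # → True
```

A real evaluation would go further than parsing (e.g. attempting to import and launch the server, as the L1 Launch metric suggests), but syntactic validity is the first gate any generated server must pass.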

Key Capabilities & Features

  • Autonomous MCP Server Generation: Directly generates functional MCP servers from text prompts.
  • Python Implementation: Outputs server code compatible with the FastMCP framework.
  • Enhanced Performance: Achieves notable absolute score gains over the base Qwen3-8B model on the Tool-Genesis Benchmark for direct generation:
    • L1 Compliance: +14.0 points (0.686 → 0.826)
    • L1 Launch: +3.5 points (0.012 → 0.047)
    • L2 Schema F1: +3.5 points (0.011 → 0.046)
    • L2 UT Soft: +1.6 points (0.001 → 0.017)
  • Training Data: Fine-tuned on approximately 2,500 curated MCP server generation examples from the Tool-Genesis benchmark.
  • Context Length: Inherits a 131,072-token context window from its base model, though the SFT samples used only a small fraction of that window in practice.
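The L2 Schema F1 metric above compares tool schemas. As a rough illustration of what such a schema looks like, the following derives a minimal JSON-Schema-style parameter description from a Python function signature; the type mapping, helper name, and example tool are assumptions for illustration, not the benchmark's actual scorer.

```python
import inspect

# Map a few Python annotations to JSON Schema types (illustrative subset).
_TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn):
    """Build a minimal MCP-style tool schema from a function signature."""
    sig = inspect.signature(fn)
    properties = {}
    required = []
    for name, param in sig.parameters.items():
        properties[name] = {"type": _TYPE_MAP.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # parameters without defaults are required
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }

def get_forecast(city: str, days: int = 3) -> str:
    """Return a short forecast for a city."""
    ...

schema = tool_schema(get_forecast)
print(schema["inputSchema"]["required"])  # → ['city']
```

Comparing the generated server's schemas against reference schemas of this shape is presumably how an F1-style score over tool parameters can be computed.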

Good For

  • Developers and researchers focused on automated tool creation and agentic workflows.
  • Generating custom MCP servers based on natural language specifications.
  • Applications requiring programmatic tool definition and implementation for AI agents.
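For the "natural language specifications" use case above, one plausible way to drive the model is a standard chat-style request. The system prompt and scenario wording below are invented for illustration; the actual prompt format used during SFT is not documented here.

```python
# Hypothetical chat-style request for the model; the system prompt and
# scenario text are invented examples, not the documented training format.
def build_generation_request(scenario: str) -> list[dict]:
    """Wrap a scenario description in chat messages for the SFT model."""
    return [
        {
            "role": "system",
            "content": "You generate complete, runnable Python MCP servers "
                       "using the FastMCP framework.",
        },
        {"role": "user", "content": scenario},
    ]

messages = build_generation_request(
    "Create an MCP server exposing a tool that converts temperatures "
    "between Celsius and Fahrenheit."
)
print(messages[1]["role"])  # → user
```

These messages could then be passed to any chat-completion runtime that serves the model; the single-turn scenario-in, server-out shape matches the direct-generation setting the benchmark numbers describe.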

Limitations

  • While the base model supports a long context window, fine-tuning focused narrowly on server-generation tasks and did not exercise that full window.
  • As of the README, the SFT model has not yet been evaluated with a coder-agent strategy, even though that strategy significantly boosts other models on the benchmark.