Nanami14138/qwen3-4b-instruct-code-agent

TEXT GENERATION | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Apr 26, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

Nanami14138/qwen3-4b-instruct-code-agent is a 4-billion-parameter, Qwen3-based, instruction-tuned model fine-tuned to act as a code execution and code review agent. It follows a structured ReAct workflow and emits XML-formatted responses for automated code generation, iterative debugging, and tool-augmented LLM applications. Its output is meant to be parsed by orchestration frameworks that drive a code execution environment.

Model Overview

Nanami14138/qwen3-4b-instruct-code-agent is a LoRA fine-tune of the Qwen3-4B-Instruct model, optimized to function as an autonomous coding agent. Given a task, it generates structured XML responses that follow a ReAct (Plan → Execute → Reflect → Finish) workflow. An orchestration framework can parse this output to execute code, review the results, and drive iterative debugging.
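
As a rough illustration of how an orchestration framework might parse such output, here is a minimal Python sketch. The tag names (plan, execute, reflect, finish) are assumptions chosen to mirror the workflow steps; this page does not publish the model's actual XML schema, so adjust them to what the model really emits. A regular expression is used instead of a strict XML parser because generated code often contains characters such as "<" that are not valid inside XML text.

```python
import re
from dataclasses import dataclass

# Hypothetical tag names that mirror the workflow steps; the real schema
# emitted by the model may differ.
STEP_TAGS = ("plan", "execute", "reflect", "finish")

# Matches <tag>...</tag> for any of the tags above, across multiple lines.
STEP_PATTERN = re.compile(r"<(%s)>(.*?)</\1>" % "|".join(STEP_TAGS), re.DOTALL)


@dataclass
class Step:
    kind: str  # one of STEP_TAGS
    body: str  # text between the opening and closing tag, stripped


def parse_steps(response: str) -> list[Step]:
    """Return the workflow steps found in a response, in order of appearance."""
    return [Step(m.group(1), m.group(2).strip())
            for m in STEP_PATTERN.finditer(response)]


# Hand-written sample input, not real model output.
sample = "<plan>Sum the list.</plan>\n<execute>print(sum([1, 2, 3]))</execute>"
steps = parse_steps(sample)
```

Each Step can then be routed by its kind, for example sending the body of an execute step to a sandbox.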

Key Capabilities

  • Structured Output: Generates XML-formatted responses for each step of the ReAct workflow, ensuring parseable and actionable output.
  • Code Agent Workflow: Follows a Plan → Execute → Reflect → Finish state machine for systematic problem-solving.
  • Iterative Debugging: Features a Reflect node to analyze execution failures, identify root causes, and guide corrective actions.
  • Tool Integration: Designed to call external tools such as python_sandbox for code execution.
  • Code Generation & Review: Fine-tuned on the m-a-p/Code-Feedback dataset, specializing in multi-turn code conversations, generation, and review.
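
The workflow and tool integration described above could be driven by a loop along the following lines. This is a sketch under stated assumptions, not the author's orchestration code: run_agent, scripted_model, and fake_sandbox are hypothetical names, the execute and finish tags are assumed rather than documented, and the sandbox is a stub standing in for a real tool such as python_sandbox.

```python
import re

# Assumed tag names; only <execute> and <finish> require action from the loop.
ACTION = re.compile(r"<(execute|finish)>(.*?)</\1>", re.DOTALL)


def run_agent(generate, run_code, task, max_turns=5):
    """Drive the workflow until the model finishes or max_turns is reached.

    generate(transcript) -> str      : hypothetical wrapper around the model
    run_code(code) -> (ok, output)   : hypothetical sandbox interface
    Returns the body of the finish step, or None if the model never finishes.
    """
    transcript = [task]
    for _ in range(max_turns):
        reply = generate(transcript)
        transcript.append(reply)
        match = ACTION.search(reply)
        if match is None:
            transcript.append("No <execute> or <finish> tag found.")
            continue
        kind, body = match.group(1), match.group(2).strip()
        if kind == "finish":
            return body
        # Feed the execution result back so the model can reflect on failures.
        ok, output = run_code(body)
        transcript.append(("OK: " if ok else "ERROR: ") + output)
    return None


def scripted_model(replies):
    """Stand-in for the model: replays canned replies one per turn."""
    it = iter(replies)
    return lambda transcript: next(it)


def fake_sandbox(code):
    """Stand-in for a sandbox: pretends any code containing 1/0 fails."""
    if "1/0" in code:
        return False, "ZeroDivisionError: division by zero"
    return True, "3"


model = scripted_model([
    "<plan>Compute the sum.</plan><execute>print(1/0)</execute>",
    "<reflect>Division by zero; add instead.</reflect><execute>print(1+2)</execute>",
    "<finish>The answer is 3.</finish>",
])
answer = run_agent(model, fake_sandbox, "Add 1 and 2.")
```

Passing the model and the sandbox in as callables keeps the loop independent of any particular inference server or execution backend, and the max_turns cap prevents an agent that never reaches Finish from running indefinitely.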

Ideal Use Cases

This model is particularly well-suited for:

  • Building automated code generation systems with integrated execution feedback loops.
  • Developing code review and iterative debugging pipelines.
  • Creating tool-augmented LLM applications that require sandbox execution.
  • Powering educational coding assistants that guide users through problem-solving.