ZhuofengLi/qwen3.5-9b-nemotron-sft-ckpt200

Vision | Concurrency Cost: 1 | Model Size: 9B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Jun 20, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

ZhuofengLi/qwen3.5-9b-nemotron-sft-ckpt200 is an intermediate checkpoint of a 9-billion-parameter Qwen3.5 model, fine-tuned by ZhuofengLi on the NVIDIA Nemotron-Terminal-Corpus dataset. It is trained to improve terminal interaction and agentic capabilities and has a 32K context length. Its target tasks involve multi-step terminal execution, including mathematical, coding, and software engineering challenges.


Model Overview

ZhuofengLi/qwen3.5-9b-nemotron-sft-ckpt200 is an intermediate supervised fine-tuning (SFT) checkpoint of Qwen3.5-9B. ZhuofengLi fine-tuned it with the ms-swift framework and DeepSpeed ZeRO-3, targeting improved terminal interaction and agentic capabilities.

Key Capabilities & Training

  • Enhanced Terminal Interaction: The model is trained on the nvidia/Nemotron-Terminal-Corpus dataset, which comprises 366k multi-step terminal execution trajectories.
  • Agentic Task Performance: Optimized for tasks requiring sequential actions and problem-solving within a terminal environment.
  • Diverse Task Coverage: The training data includes a variety of tasks such as mathematics, code generation, and software engineering challenges.
  • Training Details: Fine-tuned with a learning rate of 2e-5, a global batch size of 64, and a maximum sequence length of 262,144 tokens, in BF16 precision on 64 H200 GPUs.
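
The training figures above imply some simple arithmetic, sketched below. This is only an illustration: it assumes all 64 GPUs act as data-parallel ranks, which the card does not state (sequence or tensor parallelism would change the per-rank number), and the function and constant names are hypothetical.

```python
# Sanity-check arithmetic for the reported training setup.
# ASSUMPTION: every GPU is one data-parallel rank; the card does not
# describe the actual parallelism layout.

GLOBAL_BATCH_SIZE = 64      # reported global batch size
MAX_SEQ_LEN = 262_144       # reported maximum sequence length (tokens)
NUM_GPUS = 64               # reported H200 count


def samples_per_rank_per_step(global_batch: int, dp_ranks: int) -> int:
    """Samples each data-parallel rank contributes to one optimizer step
    (micro-batch size times gradient-accumulation steps)."""
    if global_batch % dp_ranks != 0:
        raise ValueError("global batch must divide evenly across ranks")
    return global_batch // dp_ranks


samples_per_rank = samples_per_rank_per_step(GLOBAL_BATCH_SIZE, NUM_GPUS)

# Upper bound on tokens per optimizer step, reached only if every
# sequence in the batch is at the maximum length.
max_tokens_per_step = GLOBAL_BATCH_SIZE * MAX_SEQ_LEN

print(samples_per_rank)      # 1
print(max_tokens_per_step)   # 16777216
```

Under that assumption each rank processes one sequence per optimizer step, and a step covers at most about 16.8 million tokens.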

When to Use This Model

This model is particularly well-suited for applications requiring:

  • Automated Terminal Operations: Interacting with command-line interfaces or scripting environments.
  • Agentic Workflows: Developing AI agents that can execute multi-step plans in a terminal.
  • Code and Software Engineering Assistance: Tasks involving code execution, debugging, or software development within a terminal context.
  • Mathematical Problem Solving: Handling math-related problems that can be solved through terminal commands or scripts.
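
The multi-step terminal workflow described above can be sketched as a simple observe-act loop. This is a minimal illustration, not the author's harness: `run_agent` and `scripted_policy` are hypothetical names, and the policy here is a fixed script standing in for a call to the model, since the card does not specify a serving API or prompt format.

```python
import subprocess
import sys
from typing import Callable, List, Optional, Tuple

# One trajectory step: the command that was run and the output observed.
Step = Tuple[List[str], str]


def run_agent(
    policy: Callable[[List[Step]], Optional[List[str]]],
    max_steps: int = 8,
    timeout_s: float = 10.0,
) -> List[Step]:
    """Repeatedly ask the policy for a command, run it, and record the output.

    The policy sees the full history so far and returns an argv list,
    or None to signal that the task is finished.
    """
    history: List[Step] = []
    for _ in range(max_steps):
        argv = policy(history)
        if argv is None:
            break
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
        observation = (result.stdout + result.stderr).strip()
        history.append((argv, observation))
    return history


def scripted_policy(history: List[Step]) -> Optional[List[str]]:
    """HYPOTHETICAL stand-in for the model: replays a fixed two-step plan."""
    plan = [
        [sys.executable, "-c", "print(6 * 7)"],
        [sys.executable, "-c", "print('done')"],
    ]
    return plan[len(history)] if len(history) < len(plan) else None


trajectory = run_agent(scripted_policy)
for argv, output in trajectory:
    print(argv[-1], "->", output)
```

In a real deployment the policy would send the history to the model and parse its reply into a command, and the commands should run in a sandbox rather than on the host.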