ZhuofengLi/Qwen3.5-9B-Base-Nemotron-SFT-2560-steps

Vision
Concurrency Cost: 1
Model Size: 9B
Quant: FP8
Ctx Length: 32k
Tool Calling: Supported
Published: Jun 18, 2026
License: apache-2.0
Architecture: Transformer
Open Weights
Cold

ZhuofengLi/Qwen3.5-9B-Base-Nemotron-SFT-2560-steps is a 9 billion parameter Qwen3.5-Base model fine-tuned by ZhuofengLi. It was trained with full-parameter SFT on NVIDIA's Nemotron-Terminal-Corpus, a dataset of terminal, CLI, and code instruction samples. The fine-tuning targets shell, command-line interface, and general coding tasks, and the model supports a 32,768-token context length.


Model Overview

This model, developed by ZhuofengLi, is a fine-tuned version of the 9 billion parameter Qwen3.5-9B-Base. It was trained with full-parameter Supervised Fine-Tuning (SFT) on the NVIDIA Nemotron-Terminal-Corpus dataset, and this release is the 2560-step checkpoint from a 2-epoch training run. The training methodology follows the recipe outlined in the paper "On Data Engineering for Scaling LLM Terminal Capabilities" (NVIDIA, 2026).

Key Capabilities

  • Enhanced Terminal and Coding Proficiency: Trained on approximately 366,000 terminal/coding instruction samples from the Nemotron-Terminal-Corpus. This dataset, curated by NVIDIA, is intended to improve model performance on shell, CLI, and general code-related tasks.
  • Large Context Window: Supports a maximum sequence length of 32,768 tokens, enabling the processing of extensive code snippets or complex command sequences.
  • Robust Training Infrastructure: Trained with DeepSpeed ZeRO-3 and FlashAttention-2 on H200 GPUs.
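
The 32,768-token limit above covers the prompt and the generated tokens together, so long terminal transcripts leave less room for output. The following is a minimal sketch of that budgeting arithmetic; the helper names `fits_in_context` and `max_generation_budget` are hypothetical and not part of any library, and the token counts are assumed to come from whatever tokenizer is used with the model.

```python
# Maximum sequence length stated in this model card.
MAX_CONTEXT_TOKENS = 32_768


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    limit: int = MAX_CONTEXT_TOKENS) -> bool:
    """Return True if prompt plus requested output fits within the limit."""
    return prompt_tokens + max_new_tokens <= limit


def max_generation_budget(prompt_tokens: int,
                          limit: int = MAX_CONTEXT_TOKENS) -> int:
    """Return how many tokens remain for generation (never negative)."""
    return max(0, limit - prompt_tokens)


# Example: a 30,000-token shell session leaves 2,768 tokens for the reply.
remaining = max_generation_budget(30_000)
```

A check like this can run before each request, so an over-long prompt is truncated or rejected up front instead of failing during generation.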

Ideal Use Cases

  • Code Generation and Completion: Well suited to generating shell commands, CLI scripts, and programming code.
  • Technical Support and Automation: Can assist in automating terminal operations or providing intelligent suggestions for command-line interactions.
  • Developer Tools Integration: Suitable for integration into IDEs or developer environments to enhance coding workflows.
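
As an illustration of the shell-command use case, the sketch below asks the model for a command through the Hugging Face `transformers` library. It rests on several assumptions not confirmed by this card: that the checkpoint loads through `AutoModelForCausalLM`, that `torch` and `accelerate` are installed, and that a plain-text prompt works. The prompt layout in `build_prompt` and the function name `suggest_command` are hypothetical, since the card does not specify a prompt or chat format.

```python
MODEL_ID = "ZhuofengLi/Qwen3.5-9B-Base-Nemotron-SFT-2560-steps"


def build_prompt(task: str) -> str:
    """Build a plain-text prompt (hypothetical format, not from the card)."""
    return (
        "Write a single shell command for the task below.\n"
        f"Task: {task.strip()}\n"
        "Command:\n"
    )


def suggest_command(task: str, max_new_tokens: int = 128) -> str:
    """Generate a command suggestion; downloads the model on first call."""
    # Imported lazily so build_prompt can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: the checkpoint is loadable as a causal language model.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(build_prompt(task), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Usage (not run here; requires a GPU and the model weights):
# print(suggest_command("find all .log files larger than 100 MB"))
```

Any suggested command should be reviewed before it is executed, especially when the output feeds an automated terminal workflow.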