Name: ZhuofengLi/qwen3.5-9b-nemotron-sft-ckpt200 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: ZhuofengLi

Model Overview

This model, ZhuofengLi/qwen3.5-9b-nemotron-sft-ckpt200, is an intermediate supervised fine-tuning (SFT) checkpoint of the Qwen3.5-9B architecture. It was fine-tuned by ZhuofengLi using the ms-swift framework with DeepSpeed ZeRO-3, specifically targeting improved terminal interaction and agentic capabilities.

Key Capabilities & Training

Enhanced Terminal Interaction: The model is trained on the nvidia/Nemotron-Terminal-Corpus dataset, which comprises 366k multi-step terminal execution trajectories.
Agentic Task Performance: Optimized for tasks requiring sequential actions and problem-solving within a terminal environment.
Diverse Task Coverage: The training data includes a variety of tasks such as mathematics, code generation, and software engineering challenges.
Training Details: Fine-tuned with a learning rate of 2e-5, a global batch size of 64, and a maximum sequence length of 262144 tokens, utilizing BF16 precision across 64 H200 GPUs.

When to Use This Model

This model is particularly well-suited for applications requiring:

Automated Terminal Operations: Interacting with command-line interfaces or scripting environments.
Agentic Workflows: Developing AI agents that can execute multi-step plans in a terminal.
Code and Software Engineering Assistance: Tasks involving code execution, debugging, or software development within a terminal context.
Mathematical Problem Solving: Handling math-related problems that can be solved through terminal commands or scripts.

Overview

Model Overview

Key Capabilities & Training

When to Use This Model

Full Model Card (README)