laion/nemotron-terminal-software_engineering__Qwen3-8B
laion/nemotron-terminal-software_engineering__Qwen3-8B is an 8-billion-parameter language model, fine-tuned from Qwen/Qwen3-8B and optimized for software engineering tasks. Its 32,768-token context length lets it process extensive codebases and technical documentation. The model is designed to improve software development workflows, including code generation, debugging, and technical problem-solving.
Overview
This model, laion/nemotron-terminal-software_engineering__Qwen3-8B, is a specialized 8-billion-parameter language model: a fine-tuned variant of the base Qwen/Qwen3-8B architecture adapted for software engineering applications. Fine-tuning used the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-software_engineering/snapshots/b1a4431744e73d63681cac4846fdba67b9427dce_thinking_preprocessed dataset, a thinking-preprocessed snapshot of the laion/nemotron-terminal-software_engineering dataset.
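A minimal usage sketch with Hugging Face transformers is shown below. The chat-template call and generation settings are illustrative assumptions, not documented defaults for this model:

```python
# Illustrative usage sketch; running it assumes enough GPU memory for an 8B model.
MODEL_ID = "laion/nemotron-terminal-software_engineering__Qwen3-8B"

# Example software engineering prompt (hypothetical).
messages = [
    {"role": "user", "content": "Write a Python function that reverses a linked list."},
]

def run():
    # Imports deferred so the sketch is readable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Qwen3-style models ship a chat template; apply it before generation.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens.
    new_tokens = output[0][inputs["input_ids"].shape[-1]:]
    print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Call `run()` to generate a completion; sampling parameters (temperature, top-p) can be passed to `generate` as needed.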
Key Characteristics
- Base Model: Qwen3-8B
- Parameter Count: 8 billion
- Context Length: 32,768 tokens
- Optimization: Fine-tuned for software engineering tasks
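To illustrate what the 32,768-token window means in practice, the sketch below (a hypothetical helper, not part of the model's API) truncates a prompt from the left so that the prompt plus the generation budget fits inside the window, keeping the most recent context:

```python
CONTEXT_LENGTH = 32_768  # the model's documented context length

def fit_to_context(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Truncate prompt tokens from the left so prompt + generation fit the window."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens exceeds the context length")
    return token_ids[-budget:]

# With a 40,000-token prompt and 512 new tokens, only the last 32,256 survive.
prompt = list(range(40_000))
fitted = fit_to_context(prompt, max_new_tokens=512)
print(len(fitted))  # 32256
```

In a real pipeline `token_ids` would come from the model's tokenizer; left-truncation is used here because the most recent code context is usually the most relevant.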
Training Details
The model was trained with a learning rate of 4e-05 and a total batch size of 96, accumulated across 32 GPUs with 3 gradient accumulation steps (implying a per-device batch size of 1). The optimizer was ADAMW_TORCH_FUSED with a cosine learning rate schedule over 7 epochs.
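The reported numbers are consistent with each other, as the arithmetic below checks; the cosine decay function uses the standard cosine-annealing formula, an assumption about the exact scheduler variant used:

```python
import math

num_gpus = 32
grad_accum_steps = 3
total_batch_size = 96

# Effective batch = GPUs * accumulation steps * per-device batch.
per_device_batch = total_batch_size // (num_gpus * grad_accum_steps)
print(per_device_batch)  # 1

def cosine_lr(step, total_steps, base_lr=4e-05, min_lr=0.0):
    """Standard cosine decay from base_lr to min_lr (scheduler variant assumed)."""
    progress = step / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

print(cosine_lr(0, 1000))     # 4e-05 at the start
print(cosine_lr(1000, 1000))  # ~0.0 at the end
```

With this schedule the learning rate starts at 4e-05 and decays smoothly toward zero over the 7 epochs of training.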