laion/nemotron-terminal-adapters_swe__Qwen3-8B
laion/nemotron-terminal-adapters_swe__Qwen3-8B is an 8-billion-parameter language model fine-tuned from the Qwen3-8B base model. It was adapted on a dataset focused on 'nemotron-terminal-adapters_swe', suggesting an optimization for terminal interactions and software-engineering tasks. Its 32K-token context length makes it suitable for processing moderately long sequences in its specialized domain.
Model Overview
This model, nemotron-terminal-adapters_swe__Qwen3-8B, is a specialized fine-tuned version of the Qwen3-8B large language model. It has 8 billion parameters and supports a context length of 32,768 tokens. Fine-tuning used the dataset at /e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-adapters_swe/snapshots/297112e289bfaea4f73e193a41f860e868850e05_thinking_preprocessed, indicating a focus on terminal environments and software-engineering workflows.
Training Details
The model was trained for 5 epochs with a peak learning rate of 4e-05, using a multi-GPU setup of 32 devices and a total batch size of 96. The optimizer was ADAMW_TORCH_FUSED with a cosine learning-rate schedule and a warmup ratio of 0.1, a standard recipe for adapting the base Qwen3-8B model to its target domain.
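The stated schedule (peak learning rate 4e-05, linear warmup over the first 10% of steps, then cosine decay) can be sketched as below. This is a hedged illustration: the total step count is an arbitrary placeholder, and the assumption of no gradient accumulation is not confirmed by the training configuration.

```python
import math

PEAK_LR = 4e-5        # from the training details above
WARMUP_RATIO = 0.1    # from the training details above

def lr_at(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay toward zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# With 32 devices and a total batch size of 96, the per-device batch size
# works out to 96 // 32 = 3 (assuming no gradient accumulation, which the
# card does not specify).
per_device_batch = 96 // 32
```

The learning rate rises linearly to 4e-05 at 10% of training, then follows a half-cosine back to zero at the final step.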
Key Characteristics
- Base Model: Qwen3-8B
- Parameter Count: 8 Billion
- Context Length: 32,768 tokens
- Fine-tuning Focus: Specialized dataset related to 'nemotron-terminal-adapters_swe', implying potential strengths in areas like command-line interfaces, scripting, or software development tasks.
Potential Use Cases
Given its fine-tuning on a domain-specific dataset, this model is likely best suited for applications requiring understanding or generation within technical terminal environments or software engineering contexts. Further details on specific intended uses and limitations are not provided in the current documentation.