laion/nemotron-terminal-data_processing__Qwen3-8B
The laion/nemotron-terminal-data_processing__Qwen3-8B model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-data_processing/snapshots/78e341b1c482ae93ac8ef8d3f560eafd7afd5406_thinking_preprocessed dataset. The model is intended for data processing tasks, and its 32768-token context length allows it to handle extensive inputs.
Model Overview
This model, laion/nemotron-terminal-data_processing__Qwen3-8B, is an 8-billion-parameter language model based on Qwen/Qwen3-8B. It has been specifically fine-tuned for data processing applications.
Key Characteristics
- Base Model: Qwen/Qwen3-8B
- Parameter Count: 8 billion parameters
- Context Length: 32768 tokens, suitable for large datasets or extensive textual inputs.
Training Details
The model was fine-tuned on the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-data_processing/snapshots/78e341b1c482ae93ac8ef8d3f560eafd7afd5406_thinking_preprocessed dataset with the following configuration:
- Learning rate: 4e-05
- Total batch size: 96 (32 devices with 3 gradient accumulation steps)
- Learning rate schedule: cosine, with a 0.1 warmup ratio
- Epochs: 7
- Framework versions: Transformers 4.57.6, PyTorch 2.9.1+cu130, Datasets 4.7.0, Tokenizers 0.22.2
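The reported configuration can be sketched in plain Python to make the numbers concrete. This is an illustrative sketch, not the actual training code: the per-device batch size of 1 is inferred from 96 / (32 × 3), and the warmup-then-cosine formula below is a common implementation of such a schedule, assumed rather than taken from the training run.

```python
import math

# Values reported on the model card.
LEARNING_RATE = 4e-05
NUM_DEVICES = 32
GRAD_ACCUM_STEPS = 3
PER_DEVICE_BATCH = 1   # inferred: 32 devices * 3 accumulation steps * 1 = 96
WARMUP_RATIO = 0.1
EPOCHS = 7

# Effective (total) batch size per optimizer step.
total_batch = NUM_DEVICES * GRAD_ACCUM_STEPS * PER_DEVICE_BATCH  # 96

def cosine_lr_with_warmup(step: int, total_steps: int,
                          peak_lr: float = LEARNING_RATE,
                          warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup to peak_lr over the first warmup_ratio of training,
    then cosine decay to zero (a common, assumed formulation)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(total_batch)                       # 96
print(cosine_lr_with_warmup(100, 1000))  # peak LR right after warmup: 4e-05
print(cosine_lr_with_warmup(1000, 1000)) # decayed to 0.0 at the end
```

With a 0.1 warmup ratio, the learning rate climbs linearly over the first 10% of steps, peaks at 4e-05, and then follows a half-cosine down to zero over the remaining 90%.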
Intended Use
While specific intended uses and limitations are not detailed in the provided information, the fine-tuning on a "data_processing" dataset suggests its primary application is processing and understanding structured or unstructured data, potentially within terminal environments.