laion/nemotron-terminal-data_querying__Qwen3-8B
laion/nemotron-terminal-data_querying__Qwen3-8B is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B and optimized for data-querying tasks within a terminal environment. It supports a 32768-token context length, making it suitable for extensive data inputs and complex query structures. The fine-tuning process focused on enhancing its ability to understand and generate responses for data retrieval and manipulation.
Overview
This model, nemotron-terminal-data_querying__Qwen3-8B, is an 8-billion-parameter language model derived from the Qwen/Qwen3-8B architecture. It has been fine-tuned on the laion/nemotron-terminal-data_querying dataset to excel in data-querying scenarios, particularly within a terminal context. Its 32768-token context window allows it to handle complex and lengthy data-related prompts.
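Because the model inherits the Qwen3-8B architecture, it should load through the standard Hugging Face transformers API. The following is a minimal loading sketch, assuming the checkpoint is published under the repository id laion/nemotron-terminal-data_querying__Qwen3-8B; the dtype and device settings are illustrative defaults, not values from this card.

```python
# Minimal loading sketch; repo id, dtype, and device placement are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/nemotron-terminal-data_querying__Qwen3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the checkpoint's native precision
    device_map="auto",   # place the 8B weights across available devices
)
```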
Key Capabilities
- Specialized Data Querying: Fine-tuned for understanding and responding to data-related queries (see the inference sketch after this list).
- Large Context Window: Supports a 32768 token context length, beneficial for detailed data analysis and complex instructions.
- Qwen3-8B Foundation: Built upon the robust Qwen3-8B base model, inheriting its general language understanding capabilities.
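Continuing from the loading sketch above, a single data-querying prompt can be run through the model's chat template as follows. The prompt text and generation settings are illustrative assumptions, not recommendations from the card.

```python
# Illustrative inference sketch; prompt and generation settings are assumptions.
messages = [
    {"role": "user", "content": "List the 5 largest files under /var/log, sorted by size."}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```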
Training Details
The model was trained with a learning rate of 4e-05 and an effective batch size of 96 (train_batch_size 1 × gradient_accumulation_steps 3 × 32 GPUs), using the AdamW optimizer. A cosine learning rate scheduler with a 0.1 warmup ratio was employed over 7 epochs. Training used Transformers 4.57.6 and PyTorch 2.9.1+cu130.
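For reference, the stated hyperparameters map onto a transformers TrainingArguments configuration roughly as sketched below. This is a reconstruction from the numbers above, not the actual training script; the output directory and the bf16 flag are assumptions.

```python
# Hyperparameters reconstructed from the card; output_dir and bf16 are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="nemotron-terminal-data_querying__Qwen3-8B",  # placeholder path
    learning_rate=4e-05,
    per_device_train_batch_size=1,   # 1 per GPU
    gradient_accumulation_steps=3,   # 1 x 3 x 32 GPUs = effective batch size 96
    num_train_epochs=7,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    bf16=True,                       # assumption: typical precision for Qwen3 fine-tunes
)
```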