Overview
laion/nemotron-100000-opt100k__Qwen3-8B is an 8-billion-parameter language model built on the Qwen3-8B architecture and fine-tuned on the laion/nemotron-terminal-corpus-unified-100000 dataset. It supports a context length of 32,768 tokens, allowing it to process and generate long sequences of text.
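A minimal inference sketch is shown below, assuming the standard transformers causal-LM interface and a Qwen3-style chat template bundled with the tokenizer; it is not taken from the model card itself.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/nemotron-100000-opt100k__Qwen3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Build a chat-formatted prompt (assumes the tokenizer ships a chat template).
messages = [{"role": "user", "content": "Summarize what a terminal multiplexer does."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a response and decode only the newly produced tokens.
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```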
Key Characteristics
- Base Model: Qwen3-8B, a robust foundation for general language tasks.
- Parameter Count: 8 billion parameters, offering a balance between performance and computational efficiency.
- Context Length: 32K tokens, enabling the model to handle extensive inputs and maintain coherence over long conversations or documents (see the sketch after this list).
- Fine-tuning Dataset: laion/nemotron-terminal-corpus-unified-100000, suggesting potential strengths related to the characteristics of this corpus.
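To stay within the 32,768-token window, long inputs can be truncated at tokenization time. The snippet below is a small illustrative sketch using the standard tokenizer truncation options; the placeholder document is hypothetical.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("laion/nemotron-100000-opt100k__Qwen3-8B")

long_document = "..."  # placeholder for a long input document
# Truncate to the model's 32,768-token context length before generation.
inputs = tokenizer(long_document, truncation=True, max_length=32768, return_tensors="pt")
print(inputs["input_ids"].shape)  # at most (1, 32768)
```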
Training Details
The model was fine-tuned with a learning rate of 4e-05 and a total batch size of 96 (32 GPUs with 3 gradient accumulation steps), using the AdamW optimizer with a cosine learning-rate scheduler over 5 epochs.
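For reference, these reported hyperparameters map onto a transformers TrainingArguments configuration roughly as sketched below. The per-device batch size of 1 is an assumption (32 GPUs × 1 per device × 3 accumulation steps = 96); the actual training setup may have differed.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="nemotron-100000-opt100k__Qwen3-8B",
    learning_rate=4e-05,                 # reported learning rate
    per_device_train_batch_size=1,       # assumed split across 32 GPUs
    gradient_accumulation_steps=3,       # reported accumulation steps
    num_train_epochs=5,                  # reported number of epochs
    optim="adamw_torch",                 # AdamW optimizer
    lr_scheduler_type="cosine",          # cosine learning-rate schedule
)
```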