laion/nemotron-terminal-corpus-unified-1000__Qwen3-8B

Task: Text Generation
Concurrency Cost: 1
Model Size: 8B
Quant: FP8
Ctx Length: 32k
Published: Mar 25, 2026
License: other
Architecture: Transformer

laion/nemotron-terminal-corpus-unified-1000__Qwen3-8B is an 8 billion parameter language model fine-tuned from the Qwen/Qwen3-8B base model on the laion/nemotron-terminal-corpus-unified-1000 dataset. It is described as suitable for general language understanding and generation tasks, and its 32,768-token context length allows it to process long inputs.

Model Overview

laion/nemotron-terminal-corpus-unified-1000__Qwen3-8B is a fine-tuned variant of Qwen/Qwen3-8B, the 8 billion parameter base model developed by Qwen. The fine-tuning data was read from a local snapshot of the laion/nemotron-terminal-corpus-unified-1000 dataset at the following path: /e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-corpus-unified-1000/snapshots/28e7ec7da9f41ff659c67c33da7f709850f8dd46_thinking_preprocessed

Training Details

The model was trained for 7 epochs with a learning rate of 4e-05 and a cosine learning rate scheduler with a warmup ratio of 0.1. Training used a total batch size of 96 across 32 devices and the AdamW_Torch_Fused optimizer. The framework versions were Transformers 4.57.6, PyTorch 2.9.1+cu130, Datasets 4.7.0, and Tokenizers 0.22.2.
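
To make the schedule and batch figures above concrete, here is a minimal sketch of a linear-warmup, cosine-decay learning rate curve using the stated peak rate (4e-05) and warmup ratio (0.1). The total number of optimizer steps is not given in the model card, so TOTAL_STEPS is a hypothetical placeholder, and the decay-to-zero shape is an assumption about how the scheduler was configured. The last lines divide the total batch size by the device count; how that per-device share was split between micro-batch size and gradient accumulation is not stated.

```python
import math

PEAK_LR = 4e-05        # from the model card
WARMUP_RATIO = 0.1     # from the model card
TOTAL_BATCH_SIZE = 96  # from the model card
NUM_DEVICES = 32       # from the model card
TOTAL_STEPS = 1000     # hypothetical: the real step count is not published


def lr_at_step(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step (linear warmup, cosine decay to 0)."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Samples each device contributes to one optimizer step
# (per-device micro-batch size times gradient accumulation steps).
samples_per_device_per_step = TOTAL_BATCH_SIZE // NUM_DEVICES

if __name__ == "__main__":
    for s in (0, 50, 100, 550, 1000):
        print(s, lr_at_step(s))
    print("samples per device per step:", samples_per_device_per_step)
```

With these figures each device contributes 3 samples per optimizer step, and the learning rate reaches its peak once 10% of the steps have elapsed.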

Key Characteristics

  • Base Model: Qwen/Qwen3-8B
  • Parameter Count: 8 billion
  • Context Length: 32768 tokens
  • Fine-tuning Dataset: laion/nemotron-terminal-corpus-unified-1000
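
The characteristics above can be used as follows. This is a minimal sketch, not an official usage recipe: it assumes the model is available on the Hugging Face Hub under the identifier shown, that the transformers and accelerate packages are installed, and that enough GPU memory is available. The helper names load_model and fits_in_context are hypothetical and introduced only for illustration.

```python
MODEL_ID = "laion/nemotron-terminal-corpus-unified-1000__Qwen3-8B"
MAX_CONTEXT_TOKENS = 32768  # context length stated in the model card


def fits_in_context(num_prompt_tokens: int, max_new_tokens: int,
                    limit: int = MAX_CONTEXT_TOKENS) -> bool:
    """Return True if prompt plus requested generation stays within the context window."""
    return num_prompt_tokens + max_new_tokens <= limit


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model (assumes the repository is reachable on the Hub)."""
    # Imported lazily so the budget helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",  # use the dtype stored in the checkpoint
        device_map="auto",   # requires the accelerate package
    )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_model()
    inputs = tokenizer("List the files in the current directory.", return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]
    max_new = 128
    if fits_in_context(prompt_len, max_new):
        output = model.generate(**inputs.to(model.device), max_new_tokens=max_new)
        print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Checking the token budget before calling generate avoids requesting more tokens than the 32768-token window can hold.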

Further details regarding the model's specific intended uses, limitations, and comprehensive evaluation data are not provided in the current model card.