laion/nemotron-terminal-corpus-unified-1000__Qwen3-8B
The laion/nemotron-terminal-corpus-unified-1000__Qwen3-8B is an 8 billion parameter language model, fine-tuned from the Qwen/Qwen3-8B architecture. This model was specifically adapted using the laion/nemotron-terminal-corpus-unified-1000 dataset. It is designed for general language understanding and generation tasks, leveraging its 32768 token context length for processing extensive inputs.
Loading preview...
Model Overview
This model, laion/nemotron-terminal-corpus-unified-1000__Qwen3-8B, is an 8 billion parameter language model. It is a fine-tuned variant of the Qwen/Qwen3-8B base model, developed by Qwen. The fine-tuning process utilized the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-corpus-unified-1000/snapshots/28e7ec7da9f41ff659c67c33da7f709850f8dd46_thinking_preprocessed dataset.
Training Details
The model was trained with specific hyperparameters, including a learning rate of 4e-05, a cosine learning rate scheduler with a 0.1 warmup ratio, and 7.0 epochs. The training involved a total batch size of 96 across 32 devices, using AdamW_Torch_Fused optimizer. The framework versions used were Transformers 4.57.6, Pytorch 2.9.1+cu130, Datasets 4.7.0, and Tokenizers 0.22.2.
Key Characteristics
- Base Model: Qwen/Qwen3-8B
- Parameter Count: 8 billion
- Context Length: 32768 tokens
- Fine-tuning Dataset: laion/nemotron-terminal-corpus-unified-1000
Further details regarding the model's specific intended uses, limitations, and comprehensive evaluation data are not provided in the current model card.