laion/nemotron-terminal-corpus-unified-316__Qwen3-32B

TEXT GENERATION · Concurrency Cost: 2 · Model Size: 32B · Quant: FP8 · Ctx Length: 32k · Published: Apr 13, 2026 · License: other · Architecture: Transformer

The laion/nemotron-terminal-corpus-unified-316__Qwen3-32B model is a 32-billion-parameter language model fine-tuned from Qwen/Qwen3-32B on the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-corpus-unified-316/snapshots/ad0fe4894b2d7284a2c03286e9659b4344cbab49_thinking_preprocessed dataset. It is a specialized iteration of the Qwen3 architecture with a context length of 32,768 tokens, oriented toward the tasks represented in its fine-tuning corpus.


Model Overview

laion/nemotron-terminal-corpus-unified-316__Qwen3-32B is a 32-billion-parameter language model derived from the Qwen3-32B architecture. It has been fine-tuned on the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-corpus-unified-316/snapshots/ad0fe4894b2d7284a2c03286e9659b4344cbab49_thinking_preprocessed dataset, which indicates a specialization toward the characteristics of that particular training data.
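
For reference, the checkpoint is expected to load through the standard transformers interface inherited from Qwen3-32B. The following is a minimal sketch, assuming the base model's chat template is retained; the prompt, dtype, and device placement are illustrative choices, not taken from the card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/nemotron-terminal-corpus-unified-316__Qwen3-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Illustrative prompt only; the card does not document intended use cases.
messages = [{"role": "user", "content": "Show the last 20 lines of /var/log/syslog."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```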

Training Details

Fine-tuning used a learning rate of 4e-05 over 7.0 epochs with a cosine learning-rate scheduler and a 0.1 warmup ratio. Training ran across 96 devices with a total batch size of 96, using the ADAMW_TORCH_FUSED optimizer. The model retains a context length of 32,768 tokens.
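
The card does not name the training framework, but the reported hyperparameters map directly onto Hugging Face TrainingArguments. A minimal sketch follows; the output directory, per-device batch size (1 × 96 devices = 96 total, assuming no gradient accumulation), and bf16 setting are assumptions:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters; the card does
# not state which training framework or script was actually used.
args = TrainingArguments(
    output_dir="nemotron-terminal-corpus-unified-316__Qwen3-32B",  # assumed name
    learning_rate=4e-5,             # reported learning rate
    num_train_epochs=7.0,           # reported epochs
    lr_scheduler_type="cosine",     # reported scheduler
    warmup_ratio=0.1,               # reported warmup ratio
    per_device_train_batch_size=1,  # assumption: 1 x 96 devices = total batch 96
    optim="adamw_torch_fused",      # reported optimizer
    bf16=True,                      # assumption: typical mixed precision for 32B
)
```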

Key Characteristics

  • Base Model: Qwen/Qwen3-32B
  • Parameter Count: 32 billion
  • Context Length: 32,768 tokens
  • Fine-tuning Dataset: /e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-corpus-unified-316/snapshots/ad0fe4894b2d7284a2c03286e9659b4344cbab49_thinking_preprocessed
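
Assuming the checkpoint ships a standard config.json, the context length and architecture listed above can be sanity-checked without downloading the weights:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "laion/nemotron-terminal-corpus-unified-316__Qwen3-32B"
)
print(config.max_position_embeddings)  # expected: 32768
print(config.model_type)               # expected: qwen3
```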

Further details regarding specific intended uses, limitations, and comprehensive training/evaluation data are not provided in the current model card.