laion/nemotron-terminal-corpus-unified-316__Qwen3-32B
The laion/nemotron-terminal-corpus-unified-316__Qwen3-32B model is a 32-billion-parameter language model fine-tuned from Qwen/Qwen3-32B on the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-corpus-unified-316/snapshots/ad0fe4894b2d7284a2c03286e9659b4344cbab49_thinking_preprocessed dataset. It retains the Qwen3 architecture and a 32768-token context length, and specializes in the tasks represented by its fine-tuning corpus.
Model Overview
This model, laion/nemotron-terminal-corpus-unified-316__Qwen3-32B, is a 32-billion-parameter language model derived from the Qwen3-32B architecture. It was fine-tuned on the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-corpus-unified-316/snapshots/ad0fe4894b2d7284a2c03286e9659b4344cbab49_thinking_preprocessed dataset, so its behavior is specialized toward the characteristics of that training data.
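As a checkpoint derived from Qwen3-32B, the model should be loadable with the standard Hugging Face transformers API. The sketch below assumes the model card's title is also its Hub repo id and that the usual Qwen3 auto-classes apply; the generation settings are illustrative, not documented defaults.

```python
# Hypothetical usage sketch, assuming the card title is the Hub repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "laion/nemotron-terminal-corpus-unified-316__Qwen3-32B"


def load(model_id: str = MODEL_ID):
    """Load tokenizer and model; device_map='auto' shards the 32B
    checkpoint across available accelerators."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load()
    inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(out[0]))
```

The guarded `__main__` block keeps the expensive download and generation out of import time; swap in your own prompt and sampling parameters as needed.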
Training Details
Fine-tuning used a peak learning rate of 4e-05 for 7 epochs, with a cosine learning-rate scheduler and a warmup ratio of 0.1. Training ran across 96 devices with a total batch size of 96, using the ADAMW_TORCH_FUSED optimizer. The model retains the base model's context length of 32768 tokens.
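The schedule described above can be sketched as follows. The card does not state the dataset size, so `total_steps` here is a hypothetical example value; only the peak learning rate (4e-05) and warmup ratio (0.1) come from the card.

```python
import math

# Illustrative sketch of linear warmup followed by cosine decay to zero,
# matching the scheduler described above. total_steps is an assumption.
def lr_at(step: int, total_steps: int, base_lr: float = 4e-5,
          warmup_ratio: float = 0.1) -> float:
    """Return the learning rate at a given optimizer step."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup window.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 1000                 # hypothetical step count
lr_peak = lr_at(100, total)  # end of warmup: reaches base_lr (4e-5)
lr_end = lr_at(1000, total)  # end of training: decays to ~0
```

Note that a total batch size of 96 across 96 devices implies one sequence per device per step, assuming no gradient accumulation (the card does not specify accumulation settings).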
Key Characteristics
- Base Model: Qwen/Qwen3-32B
- Parameter Count: 32 billion
- Context Length: 32768 tokens
- Fine-tuning Dataset:
/e/data1/datasets/playground/ot/hf_hub/datasets--laion--nemotron-terminal-corpus-unified-316/snapshots/ad0fe4894b2d7284a2c03286e9659b4344cbab49_thinking_preprocessed
Further details regarding intended uses, limitations, and comprehensive training/evaluation data are not provided in this model card.