laion/nemotron-terminal-corpus-unified-31600__Qwen3-32B

Text generation | Model size: 32B | Quantization: FP8 | Context length: 32k | Concurrency cost: 2 | Published: Apr 16, 2026 | License: other | Architecture: Transformer

This model is a 32-billion-parameter language model fine-tuned from Qwen/Qwen3-32B by laion. It was trained on the nemotron-terminal-corpus-unified-31600 dataset, which suggests a specialization in terminal and command-line content. The model supports a 32,768-token context length, making it suitable for tasks that require extensive contextual understanding.


Model Overview

This model, laion/nemotron-terminal-corpus-unified-31600__Qwen3-32B, is a fine-tuned variant of the Qwen3-32B architecture, developed by laion. It was adapted using the nemotron-terminal-corpus-unified-31600 dataset, indicating a likely focus on terminal interactions and command-line workflows.
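
For reference, a minimal loading sketch using the standard Transformers AutoModel API. The repository id comes from this card; the dtype and device-placement flags are assumptions:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/nemotron-terminal-corpus-unified-31600__Qwen3-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # assumption: load in the checkpoint's stored precision
    device_map="auto",   # assumption: shard the 32B weights across available GPUs
)
```

Even at FP8, a 32B checkpoint is roughly 32 GB of weights, so `device_map="auto"` (or a dedicated inference server) is the practical default rather than single-GPU loading.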

Training Details

The model was fine-tuned with the following hyperparameters:

- Learning rate: 4e-05
- Epochs: 7
- Hardware: multi-GPU, 96 devices
- Total train batch size: 96
- Optimizer: ADAMW_TORCH_FUSED
- LR scheduler: cosine, warmup ratio 0.1

Framework versions: Transformers 4.57.6, PyTorch 2.9.1+cu130, Datasets 4.7.0, Tokenizers 0.22.2.
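
For illustration, the same configuration expressed through the Transformers `TrainingArguments` API. The `output_dir` and per-device batch size are assumptions; a total batch size of 96 across 96 devices implies 1 sample per device if no gradient accumulation was used:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="nemotron-terminal-corpus-unified-31600__Qwen3-32B",  # assumption
    learning_rate=4e-5,
    num_train_epochs=7,
    per_device_train_batch_size=1,  # 96 total / 96 GPUs, assuming no grad accumulation
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch_fused",
    bf16=True,  # assumption: mixed precision, typical for 32B fine-tunes
)
```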

Intended Use

While specific intended uses and limitations have not been published, fine-tuning on a terminal-focused corpus points to applications such as shell command generation, script analysis, and processing structured terminal output. Developers should weigh the model's provenance and training data when assessing fit for a given use case.
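
As a concrete illustration of the terminal-oriented use case, a hedged inference sketch. The prompt is hypothetical, and the snippet assumes the checkpoint ships the standard Qwen3 chat template:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/nemotron-terminal-corpus-unified-31600__Qwen3-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Hypothetical terminal-analysis prompt; not taken from the model card.
messages = [{"role": "user",
             "content": "Explain what this command does:\nfind . -name '*.log' -mtime +7 -delete"}]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```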