laion/nemosci-tasrep-nemodebug-a1mfc-gfistaqc-scaff-maxeps-swes-r2eg-32b__Qwen3-32B
The laion/nemosci-tasrep-nemodebug-a1mfc-gfistaqc-scaff-maxeps-swes-r2eg-32b__Qwen3-32B model is a 32-billion-parameter language model fine-tuned from Qwen/Qwen3-32B. It was trained on a diverse collection of datasets covering scientific computing, repetition-penalty traces, debugging, multi-file composition, STAQC embeddings, and sandboxed agent environments. The model is specialized for complex problem-solving, code analysis, and agent-based interactions, and its 32,768-token context length supports the substantial inputs that detailed analysis and generation require.
Overview
This model, laion/nemosci-tasrep-nemodebug-a1mfc-gfistaqc-scaff-maxeps-swes-r2eg-32b__Qwen3-32B, is a 32-billion-parameter language model derived from the Qwen3-32B architecture. It has been fine-tuned on a diverse set of datasets, indicating a specialization in complex technical and reasoning tasks.
Key Capabilities & Training Focus
The fine-tuning process involved several distinct datasets, suggesting a broad range of specialized capabilities:
- Scientific Computing: Training on `nemotron-terminal-scientific_computing` implies proficiency in scientific problem-solving and data interpretation.
- Debugging & Code Analysis: Datasets like `nemotron-terminal-debugging` and `a1_multifile_composition` point to strengths in identifying and resolving code issues, as well as understanding multi-file projects.
- Agent-based Interactions: The inclusion of `exp_tas_repetition_penalty_1.05_traces` and `exp_tas_max_episodes_512_traces` suggests optimization for agent-like reasoning and decision-making processes.
- Code Generation & Scaffolding: Training on `a1_repo_scaffold` and `swesmith-sandboxes-with_tests` indicates an ability to generate code structures and handle test-driven development scenarios.
- Complex Problem Solving: The `Kimi-2.5-r2egym_sandboxes-maxeps-32k` dataset further reinforces its capacity for tackling intricate problems within simulated environments.
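To try these capabilities, the model can be loaded and queried through the standard Transformers chat interface. The snippet below is a minimal sketch, assuming the checkpoint ships with the default Qwen3 chat template; the debugging prompt is a hypothetical example chosen to match the training focus above.

```python
# Minimal inference sketch; assumes the default Qwen3 chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/nemosci-tasrep-nemodebug-a1mfc-gfistaqc-scaff-maxeps-swes-r2eg-32b__Qwen3-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # shard across available GPUs
)

# Hypothetical debugging-style prompt.
messages = [
    {
        "role": "user",
        "content": (
            "This Python function crashes on empty input. "
            "Find and fix the bug:\n\n"
            "def mean(xs):\n    return sum(xs) / len(xs)"
        ),
    }
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```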
Training Details
The model was trained with a learning rate of 4e-05 for 7 epochs in a distributed setup across 96 devices, using the fused AdamW optimizer (adamw_torch_fused) and a cosine learning-rate schedule with a warmup ratio of 0.1. Training used Transformers 4.57.6, PyTorch 2.9.1+cu130, Datasets 4.7.0, and Tokenizers 0.22.2.
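For reference, these settings map onto Transformers' TrainingArguments roughly as sketched below. Only the values stated above come from the training run; the output path, batch size, and precision flag are assumptions for illustration.

```python
# Sketch of the reported hyperparameters as TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qwen3-32b-finetune",  # hypothetical path
    learning_rate=4e-5,               # reported learning rate
    num_train_epochs=7,               # reported epoch count
    lr_scheduler_type="cosine",       # cosine schedule, as reported
    warmup_ratio=0.1,                 # reported warmup ratio
    optim="adamw_torch_fused",        # fused AdamW, as reported
    per_device_train_batch_size=1,    # assumption; run used 96 devices
    bf16=True,                        # assumed precision choice
)
```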