laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg-32b-10pct__Qwen3-32B
The laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg-32b-10pct__Qwen3-32B model is a 32 billion parameter language model fine-tuned from Qwen/Qwen3-32B. It was trained on a diverse collection of scientific computing, agent trace, and code-related datasets, including Nemotron terminal scientific computing, various DCAgent experimental traces, and R2EGYM sandboxes. The model is specialized for tasks involving scientific computing, agent behavior analysis, and code scaffolding.
Model Overview
This model, laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg-32b-10pct__Qwen3-32B, is a 32 billion parameter language model derived from the Qwen/Qwen3-32B architecture. It has undergone extensive fine-tuning on a specialized collection of datasets, indicating a focus on particular domains.
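The card does not include usage instructions, but because the model is derived from Qwen/Qwen3-32B, it should load through the standard transformers causal-LM interface. The sketch below assumes the Qwen3 chat template is preserved in the checkpoint; dtype, device placement, and quantization are left to your hardware (a 32B model typically requires multiple GPUs or quantized weights).

```python
# Minimal inference sketch -- assumes the standard transformers/Qwen3 chat interface.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg-32b-10pct__Qwen3-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # shard across available GPUs
)

messages = [
    {"role": "user", "content": "Write a Python function that integrates x**2 from 0 to 1 with the trapezoidal rule."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```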
Key Fine-tuning Datasets
The model's training involved a diverse set of datasets, suggesting an emphasis on scientific and agent-based tasks:
- Nemotron terminal scientific computing: Likely enhances its ability to process and generate scientific text and code.
- DCAgent experimental traces: Multiple datasets from DCAgent, including exp_tas_repetition_penalty_1.05_traces, a1_multifile_composition, exp-gfi-staqc-embedding-mean-filtered-10K_glm_4.7_traces_jupiter, exp_tas_max_episodes_512_traces, dev_set_part1_10k_glm_4.7_traces_jupiter, and a1_repo_scaffold. These datasets suggest a specialization in understanding and generating agent-like behaviors, complex compositions, and code scaffolding.
- swesmith-sandboxes-with_tests-gpt-5-mini-passed_glm_4.7_traces: Indicates training on code generation and testing scenarios.
- penfever--Kimi-2.5-r2egym_sandboxes-maxeps-32k-10pct: Further reinforces its capabilities in agent environments and potentially reinforcement learning contexts.
Training Details
During training, the model used a learning rate of 4e-05, a total batch size of 96 across 96 devices, and 7 epochs with the AdamW optimizer and a cosine learning rate scheduler. Training was performed with Transformers 4.57.6 and PyTorch 2.9.1+cu130.
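For reference, the reported hyperparameters can be expressed as a transformers TrainingArguments configuration. This is an illustrative reconstruction only: the actual fine-tuning script, per-device batch size, and precision settings are not published, so treat every value not stated above as an assumption.

```python
# Hypothetical reconstruction of the reported training setup; not the actual script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-32b-finetune",
    learning_rate=4e-5,             # reported learning rate
    num_train_epochs=7,             # reported number of epochs
    per_device_train_batch_size=1,  # assumption: 96 devices x 1 = total batch size 96
    lr_scheduler_type="cosine",     # reported scheduler
    optim="adamw_torch",            # reported optimizer
    bf16=True,                      # assumption: mixed precision typical for 32B fine-tuning
)
```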