laion/nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg__Qwen3-8B
The laion/nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg__Qwen3-8B model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on a diverse collection of scientific-computing and agent-trace datasets, including nemotron-terminal-scientific_computing and several DCAgent and penfever datasets. The model is optimized for scientific computing, agent behavior analysis, and complex problem-solving in simulated environments, and supports a 32,768-token context length.
Model Overview
This model, laion/nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg__Qwen3-8B, is an 8-billion-parameter language model built on the Qwen3-8B architecture. It has been fine-tuned on a specialized collection of datasets focused on scientific computing and agent-interaction traces.
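Because the model keeps the Qwen3-8B architecture, it should load through the standard Hugging Face transformers interface. The sketch below is illustrative only: the model ID is taken from this card, while the prompt, dtype, and generation settings are placeholder assumptions rather than recommended values.

```python
# Minimal usage sketch (assumes the standard transformers chat-model workflow;
# prompt and generation settings are illustrative, not the authors' recommendations).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg__Qwen3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "user", "content": "Write a Python script that integrates dx/dt = -x with RK4."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```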
Key Training Datasets
The fine-tuning process used the following datasets, listed here by their local Hugging Face Hub cache paths:
- /e/data1/datasets/playground/ot-baf/hf_hub/datasets--laion--nemotron-terminal-scientific_computing
- /e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--exp_tas_repetition_penalty_1.05_traces
- /e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--a1_multifile_composition
- /e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--exp_tas_max_episodes_512_traces
- /e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--dev_set_part1_10k_glm_4.7_traces_jupiter
- /e/data1/datasets/playground/ot-baf/hf_hub/datasets--DCAgent--swesmith-sandboxes-with_tests-gpt-5-mini-passed_glm_4.7_traces
- /e/data1/datasets/playground/ot-baf/hf_hub/datasets--penfever--Kimi-2.5-r2egym_sandboxes-maxeps-32k
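The cache directory names above follow the Hub convention datasets--&lt;org&gt;--&lt;name&gt;, so the corresponding repository IDs can be inferred (e.g., laion/nemotron-terminal-scientific_computing). If those repositories are publicly accessible, they can be inspected with the datasets library; the snippet below is a sketch under that assumption, and the split name is also assumed.

```python
# Hedged sketch: repository IDs are inferred from the cache paths above
# (datasets--<org>--<name> -> "<org>/<name>") and may not all be public.
from datasets import load_dataset

dataset_ids = [
    "laion/nemotron-terminal-scientific_computing",
    "DCAgent/exp_tas_repetition_penalty_1.05_traces",
    "DCAgent/a1_multifile_composition",
    "DCAgent/exp_tas_max_episodes_512_traces",
    "DCAgent/dev_set_part1_10k_glm_4.7_traces_jupiter",
    "DCAgent/swesmith-sandboxes-with_tests-gpt-5-mini-passed_glm_4.7_traces",
    "penfever/Kimi-2.5-r2egym_sandboxes-maxeps-32k",
]

# Print the column names of the first example from each corpus (assumes a "train" split).
for repo_id in dataset_ids:
    ds = load_dataset(repo_id, split="train", streaming=True)
    print(repo_id, list(next(iter(ds)).keys()))
```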
These datasets suggest optimization for complex reasoning, scientific problem-solving, and interpreting sequential agent actions in simulated or interactive environments. The model was trained for 5 epochs on 32 GPUs with a learning rate of 4e-05, a per-device batch size of 1 (gradient-accumulated to an effective batch size of 96), and a cosine learning-rate schedule.
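For reference, the reported hyperparameters map onto a transformers TrainingArguments configuration roughly as sketched below. This is a reconstruction for illustration only: the actual training framework is not stated in this card, the accumulation-step count is inferred from the effective batch size and GPU count, and settings such as precision and warmup are assumptions.

```python
# Illustrative reconstruction of the reported hyperparameters using
# transformers.TrainingArguments; the actual training stack is not specified here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="nemosci-tasrep-qwen3-8b",   # placeholder output path
    learning_rate=4e-05,                    # reported learning rate
    per_device_train_batch_size=1,          # reported per-device batch size
    gradient_accumulation_steps=3,          # inferred: 32 GPUs * 1 * 3 = effective batch of 96
    lr_scheduler_type="cosine",             # reported scheduler
    num_train_epochs=5,                     # reported number of epochs
    bf16=True,                              # assumption: common choice for 8B-scale fine-tuning
)
```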