laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg__Qwen3-8B

Text Generation · Model Size: 8B · Quantization: FP8 · Context Length: 32k · Published: Apr 18, 2026 · License: other · Architecture: Transformer

laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg__Qwen3-8B is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was specialized through training on a diverse collection of scientific computing, multi-file composition, and agent-trace datasets, and is designed for tasks requiring advanced reasoning and problem-solving in complex computational environments, leveraging its 32,768-token context length.


Overview

This model, laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg__Qwen3-8B, is an 8-billion-parameter language model derived from the Qwen3-8B base architecture. It has been fine-tuned on a specialized collection of datasets covering scientific computing, multi-file code composition, and agent interaction traces. Its 32,768-token context window allows it to process long, complex inputs such as full documents or multi-file codebases.
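A minimal inference sketch with the Hugging Face transformers library is shown below. The model ID follows this card's title; the prompt and generation settings are illustrative assumptions, not recommended defaults.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg__Qwen3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Illustrative prompt in the model's specialty area (scientific computing).
messages = [
    {"role": "user", "content": "Write a NumPy function that integrates f(x) = x**2 over [0, 1] using the trapezoidal rule."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```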

Key Capabilities

  • Specialized Fine-tuning: Trained on datasets such as nemotron-terminal-scientific_computing, exp_tas_repetition_penalty_1.05_traces, a1_multifile_composition, exp-gfi-staqc-embedding-mean-filtered-10K_glm_4.7_traces_jupiter, exp_tas_max_episodes_512_traces, dev_set_part1_10k_glm_4.7_traces_jupiter, a1_repo_scaffold, swesmith-sandboxes-with_tests-gpt-5-mini-passed_glm_4.7_traces, and Kimi-2.5-r2egym_sandboxes-maxeps-32k.
  • Extended Context Window: Features a 32,768-token context length, suitable for tasks requiring deep understanding of long documents or codebases (see the serving sketch after this list).
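To exercise the full 32k window at inference time, a serving engine can pin the maximum sequence length explicitly. Below is a minimal sketch using vLLM, assuming the checkpoint is hosted under the ID from this card; the prompt and sampling settings are illustrative.

```python
from vllm import LLM, SamplingParams

# Pin the engine to the model's full 32,768-token context window.
llm = LLM(
    model="laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg__Qwen3-8B",
    max_model_len=32768,
)

params = SamplingParams(temperature=0.7, max_tokens=1024)  # illustrative settings
outputs = llm.generate(["Summarize the following repository layout:\n..."], params)
print(outputs[0].outputs[0].text)
```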

Training Details

The model was trained for 5 epochs with a learning rate of 4e-05, a total batch size of 96 (1 sample per device × 3 gradient accumulation steps × 32 GPUs), and a cosine learning rate scheduler with a 0.1 warmup ratio. The optimizer was ADAMW_TORCH_FUSED with default betas and epsilon.
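For reference, these hyperparameters map onto Hugging Face TrainingArguments roughly as follows. This is a minimal sketch under the assumption of a standard transformers training setup; the card does not specify the actual training stack, and the output path and precision flag are hypothetical.

```python
from transformers import TrainingArguments

# Per-device batch size of 1 follows from the reported totals:
# 96 = 1 sample/device x 3 gradient accumulation steps x 32 GPUs.
training_args = TrainingArguments(
    output_dir="qwen3-8b-finetune",   # hypothetical output path
    learning_rate=4e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=3,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch_fused",        # AdamW with default betas and epsilon
    bf16=True,                        # assumption: mixed-precision training
)
```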

Good For

  • Scientific Computing: Tasks involving scientific data analysis, simulation, and problem-solving.
  • Complex Code Generation & Analysis: Handling multi-file projects, repository scaffolding, and understanding intricate code structures.
  • Agent-based Reasoning: Applications requiring the model to process and learn from agent interaction traces and decision-making processes.