laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg-32b-3pct__Qwen3-32B
The laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg-32b-3pct__Qwen3-32B model is a 32 billion parameter language model, fine-tuned from Qwen/Qwen3-32B. It was trained on a diverse collection of specialized datasets, including those related to scientific computing, agent traces with repetition penalty, multifile composition, and scaffold generation. This fine-tuned model is optimized for tasks requiring complex reasoning and generation within these specific domains, leveraging its large parameter count and targeted training data.
Model Overview
This model, laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg-32b-3pct__Qwen3-32B, is a 32 billion parameter language model derived from the Qwen3-32B architecture. It has undergone extensive fine-tuning on a unique combination of datasets, indicating a specialization in complex, multi-domain tasks.
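The card does not include usage instructions, but a checkpoint like this can typically be loaded with the Hugging Face transformers library as sketched below. This is a minimal, hypothetical example: the prompt, generation settings, and hardware assumptions (enough GPU memory to shard a 32B model) are illustrative, not part of the published configuration.

```python
# Minimal usage sketch (assumes the standard transformers API; requires
# `accelerate` for device_map="auto" and enough GPU memory for a 32B model).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/nemosci-tasrep-a1mfc-gfistaqc-dev1-scaff-maxeps-swes-r2eg-32b-3pct__Qwen3-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load in the checkpoint's native precision
    device_map="auto",    # shard layers across available GPUs
)

# Qwen3-based models ship a chat template; build the prompt with it.
messages = [{"role": "user", "content": "Write a NumPy function that integrates a 1D ODE with RK4."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```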
Key Training Datasets
The model's training involved several distinct datasets, suggesting a focus on diverse and intricate problem-solving:
- Scientific Computing: nemotron-terminal-scientific_computing-3pct
- Agent Traces: exp_tas_repetition_penalty_1.05_traces-3pct, exp_tas_max_episodes_512_traces-3pct
- Code & Composition: a1_multifile_composition-3pct, exp-gfi-staqc-embedding-mean-filtered-10K_glm_4.7_traces_jupiter-3pct, a1_repo_scaffold-3pct, swesmith-sandboxes-with_tests-gpt-5-mini-passed_glm_4.7_traces-3pct
- R2E Gym Sandboxes: Kimi-2.5-r2egym_sandboxes-maxeps-32k-3pct
Training Configuration
Training was conducted with a learning rate of 4e-05 over 7 epochs in a distributed setup across 96 GPUs, using the ADAMW_TORCH_FUSED optimizer with a cosine learning rate schedule and a warmup ratio of 0.1.
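For illustration, the reported hyperparameters map onto a transformers TrainingArguments object roughly as follows. This is only a hypothetical sketch: the actual training code, dataset mixing, precision settings, and 96-GPU launch configuration are not published in this card.

```python
# Hypothetical reconstruction of the reported hyperparameters; the output
# directory, batch size, and bf16 flag are assumptions, not published values.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-32b-finetune",
    learning_rate=4e-5,
    num_train_epochs=7,
    optim="adamw_torch_fused",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    per_device_train_batch_size=1,  # assumption for a 32B model
    bf16=True,                      # assumption: mixed-precision training
)
```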