laion/nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg-32b__Qwen3-32B
This model, laion/nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg-32b__Qwen3-32B, is a 32-billion-parameter language model fine-tuned from Qwen/Qwen3-32B. It was trained on a mixed collection of datasets covering scientific computing, repetition-penalty experiment traces, multi-file composition, and agent task traces. The fine-tuning targets applications that require advanced reasoning and problem-solving in complex computational environments, particularly agent-based tasks and scientific workflows, and its primary strength is handling intricate technical and scientific queries.
Model Overview
This model, laion/nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg-32b__Qwen3-32B, is a 32-billion-parameter language model derived from the Qwen3-32B architecture. It has been fine-tuned on a specialized collection of datasets oriented toward agentic, terminal, and scientific-computing workloads, described in more detail below.
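The checkpoint should load like any other Qwen3-based causal language model. The snippet below is a minimal inference sketch, assuming the standard transformers loading path applies to this repository; the prompt and generation settings are illustrative only.

```python
# Minimal inference sketch (assumes the standard transformers causal-LM path
# works for this Qwen3-based checkpoint; not verified against the repository).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/nemosci-tasrep-a1mfc-dev1-maxeps-swes-r2eg-32b__Qwen3-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 32B parameters; bf16 keeps memory manageable
    device_map="auto",           # shard across available GPUs
)

messages = [
    {"role": "user", "content": "Summarize the output of `ls -la` for a Python project."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```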
Key Fine-tuning Datasets
The model was fine-tuned using several distinct datasets, suggesting an optimization for specific types of tasks:
- nemotron-terminal-scientific_computing: implies a focus on scientific computing and terminal interactions.
- exp_tas_repetition_penalty_1.05_traces: suggests training to manage and reduce repetition in generated outputs.
- a1_multifile_composition: indicates capabilities in handling and composing information from multiple files.
- exp_tas_max_episodes_512_traces: points to training on agent-based task traces with a focus on episode management.
- dev_set_part1_10k_glm_4.7_traces_jupiter: further reinforces training on agent traces, potentially from a development set.
- swesmith-sandboxes-with_tests-gpt-5-mini-passed_glm_4.7_traces: suggests exposure to code sandboxes, tests, and potentially code generation or analysis.
- Kimi-2.5-r2egym_sandboxes-maxeps-32k: reinforces training within sandbox environments, possibly for reinforcement learning or complex task execution.
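How these corpora were combined is not published. The sketch below is a hypothetical illustration of assembling such a supervised fine-tuning mixture; the dataset identifiers, splits, and equal-weight concatenation are placeholders, not the actual recipe.

```python
# Hypothetical sketch of assembling an SFT mixture from several trace datasets.
# The real dataset identifiers, splits, and mixing weights used for this model
# are not documented here; the names below are illustrative placeholders.
from datasets import load_dataset, concatenate_datasets

trace_sources = [
    "your-org/nemotron-terminal-scientific_computing",  # assumed identifier
    "your-org/a1_multifile_composition",                # assumed identifier
    "your-org/exp_tas_max_episodes_512_traces",         # assumed identifier
]

parts = [load_dataset(name, split="train") for name in trace_sources]
mixture = concatenate_datasets(parts).shuffle(seed=42)
print(mixture)
```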
Training Configuration
The fine-tuning run used a learning rate of 4e-05, a per-device train batch size of 1, and a total train batch size of 96 across 96 devices. The optimizer was ADAMW_TORCH_FUSED with a cosine learning rate scheduler, a 0.1 warmup ratio, and 7 training epochs. These settings adapt the base Qwen3-32B model to the specialized datasets listed above.
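For reference, the reported hyperparameters map onto Hugging Face TrainingArguments as sketched below. This is a hedged reconstruction: the actual training framework, precision, and any settings not listed above are assumptions.

```python
# Hedged reconstruction of the reported hyperparameters using TrainingArguments;
# the training framework and unlisted settings (e.g. precision) are assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="nemosci-tasrep-32b-sft",  # placeholder output path
    learning_rate=4e-5,
    per_device_train_batch_size=1,        # 1 per device x 96 devices = total 96
    gradient_accumulation_steps=1,
    num_train_epochs=7,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch_fused",
    bf16=True,                            # assumed precision; not stated in the card
)
```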