Name: DCAgent/g1_timeout_e1_gpt_long_thinking_tacc-Qwen3-32B API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: DCAgent

Overview

This model, g1_timeout_e1_gpt_long_thinking_tacc-Qwen3-32B, is a fine-tuned version of the Qwen3-32B base model. It was trained on the dataset at /scratch/08134/negin/hub/datasets--DCAgent--g1_timeout_e1_gpt_long_d1_original_40k_glm47_traces_thinking_preprocessed. The dataset name points to preprocessed 'thinking' traces, which suggests a focus on complex reasoning and agentic behavior.

Key Characteristics

Base Model: Qwen/Qwen3-32B, a 32-billion-parameter large language model.
Fine-tuning Dataset: The preprocessed 'thinking'-trace dataset named in the Overview, which suggests optimization for problem-solving or agent-based interaction scenarios.
Context Length: Supports a substantial context window of 32,768 tokens, enabling it to handle long-form inputs and maintain coherence over extended interactions.
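To illustrate how the 32,768-token window constrains long interactions, the following is a minimal sketch of trimming a chat history to fit a token budget. The helper names (`rough_token_count`, `trim_history`) and the 2,048-token output reserve are hypothetical, and the whitespace-based count is only a crude stand-in for the model's real tokenizer, so treat the numbers as approximate.

```python
from typing import Callable, Dict, List

CONTEXT_LIMIT = 32_768  # context length stated in the listing


def rough_token_count(text: str) -> int:
    """Crude stand-in: counts whitespace-separated words, NOT real tokens."""
    return len(text.split())


def trim_history(
    messages: List[Dict[str, str]],
    reserve_for_output: int = 2_048,  # assumed reserve, not from the listing
    limit: int = CONTEXT_LIMIT,
    count: Callable[[str], int] = rough_token_count,
) -> List[Dict[str, str]]:
    """Keep the most recent messages whose combined size fits the budget."""
    budget = limit - reserve_for_output
    kept: List[Dict[str, str]] = []
    used = 0
    # Walk from newest to oldest so recent context is preserved first.
    for message in reversed(messages):
        cost = count(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

In practice the `count` argument would be replaced with a function backed by the actual tokenizer, since word counts usually underestimate token counts.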

Training Details

The fine-tuning run used a learning rate of 4e-05, a batch size of 1 per device across 32 GPUs (a total batch size of 32), and 7 epochs. It used the AdamW optimizer with a cosine learning rate scheduler and a warmup ratio of 0.1. Seven epochs is a relatively long fine-tuning schedule, consistent with adapting the base model closely to the specialized dataset.
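The hyperparameters above can be sketched in plain Python to show how the total batch size and the learning rate at a given step follow from them. This is an illustration under assumptions, not the authors' training code: the total step count is not given in the listing, linear warmup is assumed, a gradient accumulation factor of 1 is inferred from the stated total of 32, and the function names are hypothetical.

```python
import math

PEAK_LR = 4e-05        # learning rate from the listing
WARMUP_RATIO = 0.1     # warmup ratio from the listing
PER_DEVICE_BATCH = 1   # batch size per device from the listing
NUM_GPUS = 32          # number of GPUs from the listing


def effective_batch_size(per_device: int = PER_DEVICE_BATCH,
                         gpus: int = NUM_GPUS,
                         grad_accum: int = 1) -> int:
    """Total examples per optimizer step (grad_accum=1 is an assumption)."""
    return per_device * gpus * grad_accum


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = PEAK_LR,
               warmup_ratio: float = WARMUP_RATIO) -> float:
    """Cosine schedule with linear warmup (the warmup shape is assumed)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay from the peak to 0 along half a cosine period.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, with a hypothetical 1,000 total steps, the rate would climb for the first 100 steps, reach 4e-05, and then decay toward zero by the final step.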

Intended Use Cases

Because it was fine-tuned on a dataset of 'thinking' traces, this model is likely best suited to applications that require multi-step reasoning or that run inside agent-based systems, where generating and following explicit decision-making steps is crucial. Its large context window supports these applications by leaving room for detailed problem descriptions and long interaction histories.
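Applications that consume reasoning traces usually need to separate the trace from the final answer. The sketch below assumes this fine-tune keeps the `<think>...</think>` delimiters that Qwen3 emits in thinking mode; the listing does not confirm that, and the helper name `split_thinking` is hypothetical.

```python
import re
from typing import Tuple

# Assumption: reasoning is wrapped in <think>...</think>, as in Qwen3 thinking mode.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(completion: str) -> Tuple[str, str]:
    """Return (thinking_trace, final_answer) from a raw completion string."""
    traces = [segment.strip() for segment in THINK_BLOCK.findall(completion)]
    # Everything outside the think blocks is treated as the final answer.
    answer = THINK_BLOCK.sub("", completion).strip()
    return "\n".join(traces), answer
```

If a completion contains no think block, the trace comes back empty and the whole text is returned as the answer, so callers can handle both cases the same way.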
