DCAgent/g1_gptlong_top8_8b
DCAgent/g1_gptlong_top8_8b is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. The model is specifically adapted for long-context understanding and generation through fine-tuning on a specialized dataset, and is designed for applications that require robust performance over extended conversational or textual inputs.
Model Overview
DCAgent/g1_gptlong_top8_8b is an 8-billion-parameter language model fine-tuned from the Qwen/Qwen3-8B architecture. It was fine-tuned on the DCAgent/g1_min_episodes_e1_gpt_long_top8_glm47_traces dataset (referenced in the training configuration by its local cache path, /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--g1_min_episodes_e1_gpt_long_top8_glm47_traces), indicating optimization for tasks involving extended context lengths or complex multi-turn interactions.
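For reference, here is a minimal inference sketch using Hugging Face Transformers. It assumes the checkpoint is available on the Hub under this repo id and inherits the base model's chat template; adjust the dtype and device settings to your hardware.

```python
# Minimal inference sketch; assumes the checkpoint is hosted on the Hub
# under this repo id and uses the chat template inherited from Qwen3-8B.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DCAgent/g1_gptlong_top8_8b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load in the checkpoint's native precision
    device_map="auto",    # spread layers across available GPUs
)

messages = [{"role": "user", "content": "Summarize the key training details of this model."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```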
Key Training Details
- Base Model: Qwen/Qwen3-8B
- Learning Rate: 4e-05
- Optimizer: AdamW Torch Fused with betas=(0.9, 0.98) and epsilon=1e-08
- Epochs: 7.0
- Batch Size: A total training batch size of 96 was achieved using a `train_batch_size` of 1 and `gradient_accumulation_steps` of 2 across 48 devices (1 × 2 × 48 = 96); see the configuration sketch after this list.
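For clarity, the hyperparameters above map onto a Transformers `TrainingArguments` object roughly as sketched below. The original training script is not published, so the `output_dir` and the exact argument names are illustrative reconstructions of the reported values, not the authors' configuration.

```python
# Illustrative reconstruction of the reported hyperparameters as
# Transformers TrainingArguments; treat this as a sketch, not the
# authors' exact training setup.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="g1_gptlong_top8_8b",     # hypothetical output directory
    learning_rate=4e-5,
    num_train_epochs=7.0,
    per_device_train_batch_size=1,       # 1 sample per device
    gradient_accumulation_steps=2,       # 1 * 2 * 48 devices = effective batch of 96
    optim="adamw_torch_fused",           # AdamW Torch Fused
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
)
```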
Intended Use Cases
While specific intended uses are not documented, the fine-tuning on a "gpt_long_top8" dataset suggests suitability for applications that benefit from processing and generating content within a 32K context window (a usage sketch follows the list below). These could include:
- Long-form content generation: Summarizing or creating extensive documents.
- Complex dialogue systems: Maintaining coherence and context over many turns.
- Code analysis or generation: Handling larger codebases or detailed specifications.
- Advanced reasoning tasks: Where understanding intricate relationships across a broad text is crucial.
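As an illustration of the long-form use cases above, the following sketch feeds an entire document to the model for summarization and guards against exceeding the assumed 32K-token window. The input file name and the exact token budget are hypothetical assumptions.

```python
# Long-context usage sketch: summarize a whole document, checking that
# the prompt fits within the assumed 32K-token context window.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DCAgent/g1_gptlong_top8_8b"
MAX_CONTEXT = 32_768  # assumed 32K context window

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

document = open("long_report.txt").read()  # hypothetical input file
messages = [{"role": "user", "content": f"Summarize the following document:\n\n{document}"}]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
assert inputs.shape[-1] <= MAX_CONTEXT, "prompt exceeds the assumed 32K context window"

outputs = model.generate(inputs.to(model.device), max_new_tokens=512)
# Decode only the generated summary, skipping the prompt tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```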