laion/exp-psu-stackoverflow-31K_glm_4_7_traces
The laion/exp-psu-stackoverflow-31K_glm_4_7_traces model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the /data/cat/ws/befe330h-befe330h-otagent/huggingface/hub/datasets--DCAgent--exp-psu-stackoverflow-31K_glm_4.7_traces/snapshots/5b1d8b21707162015662fa506ad12998155f4ab9_thinking_preprocessed dataset, suggesting a specialization in Stack Overflow-style technical question-and-answer content. The model is likely suited to understanding and generating technical discussions and code-related material, and supports a 32768-token context length.
Model Overview
This model, laion/exp-psu-stackoverflow-31K_glm_4_7_traces, is an 8 billion parameter language model derived from the Qwen/Qwen3-8B architecture. It has been specifically fine-tuned on a dataset identified as /data/cat/ws/befe330h-befe330h-otagent/huggingface/hub/datasets--DCAgent--exp-psu-stackoverflow-31K_glm_4.7_traces/snapshots/5b1d8b21707162015662fa506ad12998155f4ab9_thinking_preprocessed. This specialized training suggests its primary utility lies in tasks related to the Stack Overflow domain or similar technical question-and-answer environments.
Key Characteristics
- Base Model: Qwen/Qwen3-8B
- Parameter Count: 8 billion
- Context Length: 32768 tokens
- Fine-tuning Dataset: A dataset derived from Stack Overflow traces, indicating a focus on technical content.
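As a fine-tune of Qwen/Qwen3-8B, the model should load with the standard Hugging Face transformers chat workflow. The sketch below is an assumption: the chat template and generation settings follow common Qwen3-style usage and are not stated in this card, and the `build_chat`/`answer` helpers are hypothetical names introduced here for illustration.

```python
MODEL_ID = "laion/exp-psu-stackoverflow-31K_glm_4_7_traces"
MAX_CONTEXT = 32768  # context length stated in this card


def build_chat(question: str) -> list[dict]:
    """Wrap a technical question as a single-turn chat, the message format
    Qwen3-style chat templates expect."""
    return [{"role": "user", "content": question}]


def answer(question: str, max_new_tokens: int = 512) -> str:
    # transformers is imported lazily so build_chat stays usable without
    # the heavy dependency (and model download) being available.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    prompt = tokenizer.apply_chat_template(
        build_chat(question), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(
        prompt, return_tensors="pt", truncation=True, max_length=MAX_CONTEXT
    ).to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    print(answer("How do I reverse a list in Python in place?"))
```

The lazy import keeps the module importable on machines without transformers installed; swap `device_map="auto"` for an explicit device if you are not using accelerate.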
Training Details
The model was trained for 7 epochs with a learning rate of 4e-05 and a per-device batch size of 1 across 8 GPUs, with gradient accumulation yielding an effective batch size of 16. The optimizer was ADAMW_TORCH_FUSED with a cosine learning-rate schedule and a warmup ratio of 0.1. Training used Transformers 4.57.6, PyTorch 2.9.0+cu128, Datasets 4.4.1, and Tokenizers 0.22.2.
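The stated hyperparameters can be collected in one place; this is a sketch, not the original training script, and `gradient_accumulation_steps = 2` is inferred from the stated numbers (1 per device × 8 GPUs × 2 accumulation steps = 16 effective).

```python
# Hypothetical reconstruction of the training configuration described above.
train_config = {
    "learning_rate": 4e-05,
    "per_device_train_batch_size": 1,
    "num_gpus": 8,
    "gradient_accumulation_steps": 2,  # inferred: 1 * 8 * 2 = 16 effective
    "num_train_epochs": 7,
    "optim": "adamw_torch_fused",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
}

# Effective batch size = per-device batch * GPU count * accumulation steps.
effective_batch_size = (
    train_config["per_device_train_batch_size"]
    * train_config["num_gpus"]
    * train_config["gradient_accumulation_steps"]
)
print(effective_batch_size)  # 16
```

These keys mirror the names used by transformers' TrainingArguments, so the dict can be adapted directly if you want to reproduce a similar fine-tuning run.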
Potential Use Cases
Given its fine-tuning on Stack Overflow data, this model is likely well-suited for:
- Generating responses to technical questions.
- Summarizing technical discussions or code snippets.
- Assisting with code-related queries or explanations.
- Content generation for developer documentation or forums.