laion/gpt-oss-120B-stack-overflow-32ep-131k-summtrc-fixthink1

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kLicense:apache-2.0Architecture:Transformer Open Weights Warm

The laion/gpt-oss-120B-stack-overflow-32ep-131k-summtrc-fixthink1 model is an 8 billion parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the penfever/gpt-oss-120B-stack-overflow-32ep-131k-summtrc-fixthink1 dataset, suggesting a specialization in areas related to Stack Overflow content. With a context length of 32768 tokens, this model is likely optimized for processing and generating technical information, potentially for code-related tasks or question-answering based on programming knowledge.

Loading preview...

Model Overview

This model, laion/gpt-oss-120B-stack-overflow-32ep-131k-summtrc-fixthink1, is an 8 billion parameter language model built upon the Qwen/Qwen3-8B architecture. It has been fine-tuned using the penfever/gpt-oss-120B-stack-overflow-32ep-131k-summtrc-fixthink1 dataset, indicating a specialized focus on content derived from Stack Overflow.

Key Characteristics

  • Base Model: Qwen/Qwen3-8B
  • Parameter Count: 8 billion parameters
  • Context Length: 32768 tokens, allowing for extensive input and output sequences.
  • Training Data: Fine-tuned on a dataset specifically related to Stack Overflow, suggesting proficiency in technical Q&A, code snippets, and programming discussions.

Training Details

The model was trained with a learning rate of 4e-05, using a cosine learning rate scheduler with a 0.1 warmup ratio over 7 epochs. It utilized a distributed training setup across 8 GPUs with a total batch size of 16 (gradient accumulation steps of 2).

Potential Use Cases

Given its fine-tuning on Stack Overflow data, this model is likely well-suited for:

  • Technical Question Answering: Providing answers to programming-related queries.
  • Code Assistance: Generating or explaining code snippets.
  • Developer Support: Assisting with debugging or understanding technical concepts.

Further information regarding specific capabilities, intended uses, and limitations is not detailed in the provided model card.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p