bespokelabs/qwen3-8b-dabstep-reasoning-108-fixed-reasoning-sharegpt-sft

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Jul 1, 2025License:apache-2.0Architecture:Transformer Open Weights Warm

The bespokelabs/qwen3-8b-dabstep-reasoning-108-fixed-reasoning-sharegpt-sft model is an 8 billion parameter language model, fine-tuned from Qwen/Qwen3-8B. It was specifically trained on the eval-ds-dabstep-reasoning-108-fixed-reasoning-sharegpt dataset, indicating an optimization for reasoning tasks. With a 32768 token context length, this model is designed for applications requiring robust logical processing and understanding of complex prompts.

Loading preview...

Model Overview

The bespokelabs/qwen3-8b-dabstep-reasoning-108-fixed-reasoning-sharegpt-sft is an 8 billion parameter language model, derived from the Qwen/Qwen3-8B architecture. This model has undergone specific fine-tuning on the eval-ds-dabstep-reasoning-108-fixed-reasoning-sharegpt dataset.

Key Characteristics

  • Base Model: Qwen/Qwen3-8B
  • Parameter Count: 8 billion parameters
  • Context Length: Supports a substantial context window of 32768 tokens.
  • Fine-tuning Focus: The training dataset name suggests a specialization in reasoning capabilities, particularly for tasks involving fixed reasoning and ShareGPT-style interactions.

Training Details

The model was trained using the following key hyperparameters:

  • Learning Rate: 1e-05
  • Optimizer: ADAMW_TORCH
  • Epochs: 5.0
  • Batch Size: A total training batch size of 8 across 8 devices.

Potential Use Cases

Given its fine-tuning on a reasoning-focused dataset, this model is likely well-suited for:

  • Complex problem-solving and logical deduction tasks.
  • Applications requiring robust understanding and generation of reasoned responses.
  • Scenarios benefiting from a large context window for detailed analysis.