DCAgent/exp_tas_presence_penalty_0_25_traces

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Jan 4, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

DCAgent/exp_tas_presence_penalty_0_25_traces is an 8 billion parameter language model fine-tuned from Qwen/Qwen3-8B. This model is specifically adapted using the DCAgent/exp_tas_presence_penalty_0.25_traces dataset, suggesting an optimization for tasks related to agentic behavior or specific trace analysis with a presence penalty. It features a 32K context length, making it suitable for processing longer sequences of text in its specialized domain.


Model Overview

DCAgent/exp_tas_presence_penalty_0_25_traces is an 8 billion parameter language model, fine-tuned from the base Qwen/Qwen3-8B architecture. The model has been specialized through training on the DCAgent/exp_tas_presence_penalty_0.25_traces dataset. While the model card does not fully document intended uses and limitations, the dataset name suggests a focus on agent-based tasks, potentially involving trace analysis or scenarios where a 'presence penalty' mechanism is relevant.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen3-8B.
  • Parameter Count: 8 billion parameters.
  • Context Length: Supports a context window of 32,768 tokens.
  • Training Data: Specialized on the DCAgent/exp_tas_presence_penalty_0.25_traces dataset.

Training Details

The model was trained with a learning rate of 4e-05 using the AdamW optimizer (the card does not specify the beta and epsilon values). The effective batch size was 16, obtained from 8 GPUs with 2 gradient accumulation steps (i.e., a per-device batch size of 1), over 7 epochs. A cosine learning rate scheduler with a 0.1 warmup ratio was employed.
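The reported hyperparameters can be collected in one place for inspection. This is a minimal sketch using a plain dictionary whose key names merely mirror common Hugging Face `TrainingArguments` fields; the per-device batch size of 1 is inferred from the total batch size, GPU count, and accumulation steps, and the AdamW beta/epsilon values are left out because the card does not state them.

```python
# Hypothetical summary of the reported fine-tuning configuration.
# A plain dict rather than a real TrainingArguments object, so it can
# be inspected without any dependencies. Beta/epsilon for AdamW are
# unspecified in the card and therefore omitted.
train_config = {
    "learning_rate": 4e-05,
    "optimizer": "adamw",
    "per_device_train_batch_size": 1,   # inferred: 16 / (8 GPUs * 2 accum steps)
    "num_gpus": 8,
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 7,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
}

def effective_batch_size(cfg: dict) -> int:
    """Total examples consumed per optimizer step across all GPUs."""
    return (
        cfg["per_device_train_batch_size"]
        * cfg["num_gpus"]
        * cfg["gradient_accumulation_steps"]
    )

print(effective_batch_size(train_config))  # 16, the reported total batch size
```

The check confirms the three batch-related numbers in the card are mutually consistent: 1 × 8 × 2 = 16.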

Potential Use Cases

Given its fine-tuning on a specialized dataset, this model is likely best suited for applications that align with:

  • Agentic Systems: Tasks requiring intelligent agent behavior or decision-making.
  • Trace Analysis: Processing and understanding sequential data or execution traces.
  • Specific Penalty Mechanisms: Scenarios where a 'presence penalty' (a common technique in LLM generation to discourage repetition) is a critical factor in the desired output or behavior.
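For readers unfamiliar with the mechanism named in the dataset, here is a minimal sketch of a presence penalty, assuming the common OpenAI-style additive formulation: a flat penalty is subtracted from the logit of every token that has already appeared in the output, regardless of how many times it appeared. The 0.25 in the dataset name plausibly refers to this penalty value, though the card does not confirm it.

```python
# Minimal sketch of an additive presence penalty (assumed formulation).
# Each token id that already appears in the generated sequence has the
# penalty subtracted from its logit exactly once.
def apply_presence_penalty(logits, generated_ids, penalty=0.25):
    """Return a copy of `logits` with `penalty` subtracted from the
    logit of every distinct token id in `generated_ids`."""
    adjusted = list(logits)
    for token_id in set(generated_ids):  # set(): count each token once
        adjusted[token_id] -= penalty
    return adjusted

logits = [2.0, 1.5, 0.5]
print(apply_presence_penalty(logits, generated_ids=[0, 0, 2]))
# [1.75, 1.5, 0.25]: tokens 0 and 2 are each penalized once,
# even though token 0 appeared twice (a frequency penalty would
# instead scale with the repetition count).
```

This illustrates why a presence penalty discourages reusing any already-mentioned token while remaining neutral toward unseen ones.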