laion/glm46-Toolscale-tasks-traces

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Jan 21, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

The laion/glm46-Toolscale-tasks-traces model is an 8 billion parameter language model fine-tuned from Qwen/Qwen3-8B on the DCAgent/glm46-Toolscale-tasks-traces dataset. The dataset name suggests an orientation toward tool use and agent-based task execution. The model has a 32K context length.


Overview

This model, glm46-Toolscale-tasks-traces, is an 8 billion parameter language model fine-tuned from Qwen/Qwen3-8B. The fine-tuning data is the DCAgent/glm46-Toolscale-tasks-traces dataset, which points to a specialized focus on tool use, agent interactions, and multi-step task traces.

Key Characteristics

  • Base Model: Qwen/Qwen3-8B, a general-purpose language model.
  • Fine-tuning Dataset: DCAgent/glm46-Toolscale-tasks-traces, whose name suggests agent task traces involving tool use.
  • Parameter Count: 8 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: 32,768 tokens, enabling the processing of extensive input sequences for complex task understanding.

Training Details

The model was trained for 7 epochs with the AdamW optimizer at a learning rate of 4e-05 and a total batch size of 16 (reached through gradient accumulation). The learning rate followed a cosine scheduler with a warmup ratio of 0.1.
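To make the schedule concrete, the following is a minimal sketch of a cosine learning rate schedule with warmup, using the peak rate (4e-05) and warmup ratio (0.1) stated above. The linear warmup shape, the decay to zero, and the function names are assumptions for illustration; the card does not specify them. The per-device batch size and device count passed to `grad_accum_steps` are likewise hypothetical, since only the total batch size of 16 is documented.

```python
import math

# Values taken from the training details above.
PEAK_LR = 4e-5
WARMUP_RATIO = 0.1
TOTAL_BATCH_SIZE = 16


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = PEAK_LR,
               warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at an optimizer step.

    Assumption: linear warmup from 0 to peak_lr, then cosine decay to 0.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def grad_accum_steps(total_batch: int, per_device_batch: int,
                     num_devices: int) -> int:
    """Accumulation steps needed so that
    per_device_batch * num_devices * steps == total_batch.
    """
    per_step = per_device_batch * num_devices
    if per_step <= 0 or total_batch % per_step != 0:
        raise ValueError("total batch must be a multiple of per-step batch")
    return total_batch // per_step


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; depends on dataset size and epochs
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at_step(step, total_steps))
    # Hypothetical hardware layout: 4 devices, 2 samples per device.
    print(grad_accum_steps(TOTAL_BATCH_SIZE, per_device_batch=2, num_devices=4))
```

With a warmup ratio of 0.1, the first 10% of optimizer steps ramp the rate up to 4e-05, and the remaining 90% decay it along a half cosine.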

Potential Use Cases

Given its fine-tuning on task traces, this model is likely suitable for applications involving:

  • Agentic workflows: Understanding and generating responses for AI agents interacting with tools.
  • Task automation: Interpreting and executing multi-step instructions.
  • Complex system tracing: Analyzing and predicting sequences of actions or events.
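
As an illustration of the agentic workflow case, here is a minimal sketch of the application side of a tool-use loop: extracting tool calls from generated text and dispatching them to local functions. It assumes the model emits calls as JSON wrapped in `<tool_call>` tags, as the Qwen3 base model's chat template does; this card does not confirm the output format, so treat that as an assumption. The helper names and the `add` tool are hypothetical.

```python
import json
import re
from typing import Any, Callable, Dict, List

# Assumed format: <tool_call>{"name": ..., "arguments": {...}}</tool_call>
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> List[Dict[str, Any]]:
    """Return every well-formed tool call found in the generated text."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            payload = json.loads(match.group(1).strip())
        except json.JSONDecodeError:
            continue  # skip malformed JSON rather than failing the whole turn
        if isinstance(payload, dict) and "name" in payload:
            calls.append({
                "name": payload["name"],
                "arguments": payload.get("arguments", {}),
            })
    return calls


def run_tool_calls(text: str,
                   tools: Dict[str, Callable[..., Any]]) -> List[Dict[str, Any]]:
    """Dispatch each extracted call to a registered Python function."""
    results = []
    for call in extract_tool_calls(text):
        fn = tools.get(call["name"])
        if fn is None:
            results.append({"name": call["name"], "error": "unknown tool"})
            continue
        results.append({"name": call["name"],
                        "result": fn(**call["arguments"])})
    return results


if __name__ == "__main__":
    # Hypothetical model output containing one tool call.
    sample = ('Let me compute that.\n<tool_call>\n'
              '{"name": "add", "arguments": {"a": 2, "b": 3}}\n</tool_call>')
    print(run_tool_calls(sample, {"add": lambda a, b: a + b}))
```

In a full loop, each result would be appended to the conversation as a tool message and the model called again until it stops requesting tools.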