laion/exp_tas_top_k_64_traces

Text Generation

  • Model Size: 8B
  • Quantization: FP8
  • Context Length: 32k
  • Concurrency Cost: 1
  • Published: Jan 5, 2026
  • License: apache-2.0
  • Architecture: Transformer (open weights)

The laion/exp_tas_top_k_64_traces model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B on the DCAgent/exp_tas_top_k_64_traces dataset, with a context length of 32768 tokens. It is specialized for the domain of its fine-tuning dataset, which likely involves agent traces or similar sequential data.


Model Overview

This model, laion/exp_tas_top_k_64_traces, is an 8-billion-parameter language model derived from Qwen/Qwen3-8B. It has been fine-tuned on the DCAgent/exp_tas_top_k_64_traces dataset, which suggests a specialization in processing or generating agent traces and similar sequential decision-making data.

Training Details

Fine-tuning used a learning rate of 4e-05 and an effective batch size of 16: a per-device train_batch_size of 1 with gradient_accumulation_steps of 2, distributed across 8 GPUs (1 × 2 × 8 = 16). Training ran for 7 epochs with the ADAMW_TORCH_FUSED optimizer (default beta values and epsilon) and a cosine learning-rate scheduler with a warmup ratio of 0.1.
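As a sketch, the reported hyperparameters can be collected into a single configuration dict. The field names below follow Hugging Face TrainingArguments conventions and are an assumption; the actual training script is not published.

```python
# Hypothetical reconstruction of the reported fine-tuning hyperparameters.
# Field names follow Hugging Face TrainingArguments conventions; the real
# training configuration is not published, so this is an illustration only.
training_config = {
    "learning_rate": 4e-05,
    "per_device_train_batch_size": 1,  # "train_batch_size" in the card
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 7,
    "optim": "adamw_torch_fused",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "world_size": 8,                   # training was distributed across 8 GPUs
}

# The total training batch size of 16 follows from the per-device settings:
effective_batch_size = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
    * training_config["world_size"]
)
print(effective_batch_size)  # 16
```

This is how the card's "total training batch size of 16" reconciles with a per-device batch size of 1: gradient accumulation and data parallelism each multiply the effective batch.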

Key Characteristics

  • Base Model: Qwen/Qwen3-8B
  • Parameter Count: 8 billion
  • Context Length: 32768 tokens
  • Specialization: Fine-tuned on DCAgent/exp_tas_top_k_64_traces dataset, indicating potential expertise in agent-based or trace-related tasks.
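
Assuming the checkpoint loads like any other Qwen3-based causal LM on the Hugging Face Hub (a standard pattern, not confirmed by the card), a minimal inference sketch might look like:

```python
MODEL_ID = "laion/exp_tas_top_k_64_traces"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model lazily and return a completion for `prompt`.

    Requires `transformers` and enough memory for an 8B checkpoint. The
    loading pattern is the generic one for causal LMs and is an assumption,
    not taken from the model card.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens and decode only the newly generated text.
    new_tokens = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example usage (downloads the weights on first call):
# print(generate("Summarize the following agent trace: ..."))
```

The 32768-token context length means long traces can be passed in a single prompt, but generation memory scales with sequence length, so very long inputs may still require reduced `max_new_tokens`.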

Intended Use Cases

Specific use cases are not detailed in the model card, but fine-tuning on a specialized agent-trace dataset suggests suitability for applications involving:

  • Analysis of agent behaviors or traces.
  • Generation of sequences or actions based on observed traces.
  • Tasks requiring understanding or prediction within specific agent environments.