laion/Kimi-K2T-neulab-agenttuning-webshop-sandboxes-maxeps-32k

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Jan 20, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

The laion/Kimi-K2T-neulab-agenttuning-webshop-sandboxes-maxeps-32k model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was adapted using the penfever/Kimi-K2T-neulab-agenttuning-webshop-sandboxes-maxeps-32k dataset, suggesting an optimization for agent-based tasks within webshop sandbox environments. With a context length of 32,768 tokens, the model is designed to process the extensive conversational or transactional histories typical of its fine-tuning domain.


Model Overview

This model, laion/Kimi-K2T-neulab-agenttuning-webshop-sandboxes-maxeps-32k, is a specialized large language model built upon the Qwen/Qwen3-8B architecture. It features 8 billion parameters and supports a substantial context window of 32,768 tokens, enabling it to process lengthy inputs and maintain conversational coherence over extended interactions.
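Below is a minimal inference sketch, assuming the checkpoint is hosted on the Hugging Face Hub under this repository ID and loads through the standard transformers Auto classes, as Qwen3-8B derivatives typically do; the prompt and generation settings are illustrative only:

```python
# Minimal inference sketch (assumes standard transformers APIs apply;
# the FP8 checkpoint may require additional kernels or dependencies).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/Kimi-K2T-neulab-agenttuning-webshop-sandboxes-maxeps-32k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # load weights in the dtype they ship with
    device_map="auto",   # place layers across available GPUs
)

messages = [
    {"role": "user", "content": "Find a pair of black running shoes under $80."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```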

Key Capabilities

  • Specialized Fine-tuning: The model was fine-tuned on the penfever/Kimi-K2T-neulab-agenttuning-webshop-sandboxes-maxeps-32k dataset, indicating a focus on agent behavior within webshop sandbox environments and on understanding and generating actions for simulated e-commerce interactions (see the sketch after this list).
  • Extended Context: Its 32k token context length is beneficial for applications requiring the model to recall and utilize information from long dialogues or complex scenarios, such as multi-turn agent interactions or detailed product inquiries.
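To make the agent use case concrete, here is an illustrative WebShop-style turn, continuing from the loading snippet above. The observation and action format is an assumption on our part; the dataset's exact schema is not documented on this card, though WebShop-like environments typically expose search[...] and click[...] actions:

```python
# Illustrative WebShop-style agent turn. Product IDs, prices, and the
# observation layout below are hypothetical; reuses `model` and
# `tokenizer` from the loading snippet above.
observation = (
    "Instruction: buy a stainless steel water bottle under $25\n"
    "[Search Results]\n"
    "B07XYZ1 | 24oz Stainless Steel Bottle | $19.99\n"
    "B08ABC2 | Insulated Steel Flask 32oz | $27.50\n"
    "Available actions: click[B07XYZ1], click[B08ABC2], click[Next >]"
)

messages = [
    {"role": "system", "content": "You are a shopping agent. Respond with one action."},
    {"role": "user", "content": observation},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
action = tokenizer.decode(
    model.generate(inputs, max_new_tokens=32)[0][inputs.shape[-1]:],
    skip_special_tokens=True,
)
print(action)  # e.g. "click[B07XYZ1]"
```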

Training Details

The model underwent 7 epochs of training at a learning rate of 4e-05, with a total batch size of 16 spread across 8 GPUs (i.e., a per-device batch size of 2, assuming no gradient accumulation). The optimizer was ADAMW_TORCH_FUSED with a cosine learning-rate schedule and a warmup ratio of 0.1, a conventional supervised fine-tuning recipe for a model of this scale.
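For reference, the reported hyperparameters map onto transformers.TrainingArguments roughly as follows. Only the values named above come from the card; the output directory, precision flag, and per-device batch split are assumptions:

```python
# Hedged reconstruction of the reported training configuration.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="kimi-k2t-webshop-sft",  # hypothetical path
    num_train_epochs=7,
    learning_rate=4e-5,
    per_device_train_batch_size=2,      # 2 per device x 8 GPUs = total batch 16
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch_fused",
    bf16=True,                          # assumed mixed-precision setting
)
```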