Model Overview
This model, laion/Kimi-K2T-neulab-agenttuning-mind2web-sandboxes-maxeps-32k, is an 8-billion-parameter language model based on the Qwen/Qwen3-8B architecture. It has been fine-tuned specifically for agent tuning within the Mind2Web sandboxes environment.
Key Training Details
The model was fine-tuned on the penfever/Kimi-K2T-neulab-agenttuning-mind2web-sandboxes-maxeps-32k_neulab-agenttuning-db-sandboxes dataset. Training used a learning rate of 4e-05, a total batch size of 16 (with 2 gradient accumulation steps), and the AdamW_Torch_Fused optimizer, running for 7 epochs with a cosine learning rate scheduler and a warmup ratio of 0.1. The model supports a 32,768-token context length, which benefits the long sequences that arise in complex agent interactions.
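To make the relationship between these hyperparameters concrete, here is a minimal sketch of how they combine. Only the total batch size of 16, the gradient accumulation factor of 2, the epoch count, and the warmup ratio are stated in this card; the per-device batch size, device count, and dataset size below are assumptions for illustration.

```python
# How the stated hyperparameters fit together. Values marked "assumed"
# or "hypothetical" are NOT from the model card.
per_device_batch_size = 8        # assumed
gradient_accumulation_steps = 2  # stated in the card
num_devices = 1                  # assumed

# Effective (total) batch size seen by each optimizer step.
effective_batch_size = (
    per_device_batch_size * gradient_accumulation_steps * num_devices
)
# effective_batch_size == 16, matching the stated total batch size

# A warmup ratio of 0.1 means 10% of all optimizer steps ramp the
# learning rate up before the cosine decay begins.
num_examples = 10_000            # hypothetical dataset size
num_epochs = 7                   # stated in the card
steps_per_epoch = num_examples // effective_batch_size
total_steps = steps_per_epoch * num_epochs
warmup_steps = int(0.1 * total_steps)
```

With these assumed values, the schedule would warm up over the first 437 of 4,375 optimizer steps; the actual numbers depend on the real dataset size and device count.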
Intended Use Cases
Given its fine-tuning on a specialized dataset, this model is primarily intended for:
- Agent tuning: Developing and refining AI agents, particularly within the Mind2Web framework.
- Sandbox environments: Tasks requiring interaction and learning within simulated or sandboxed web environments.
- Long context processing: Applications that benefit from a 32k-token context window for understanding complex instructions or interaction histories.
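Even with a 32k-token window, long agent rollouts eventually need their history trimmed. The helper below is an illustrative sketch, not part of the model's API: it keeps the most recent messages that fit an assumed token budget, using a rough 4-characters-per-token heuristic in place of the real tokenizer.

```python
def truncate_history(messages, max_tokens=32_768, reserve_for_output=1_024,
                     count_tokens=lambda text: len(text) // 4):
    """Drop the oldest messages until the remaining history fits the
    context budget. `count_tokens` defaults to a crude 4-chars-per-token
    heuristic; in practice you would count with the model's tokenizer.
    """
    budget = max_tokens - reserve_for_output
    kept = []
    used = 0
    # Walk from the newest message backwards, keeping as much recent
    # history as fits within the budget.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))
```

For example, with twenty messages of roughly 2,000 tokens each, only the fifteen most recent fit the 31,744-token budget that remains after reserving space for the model's output.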