weizhepei/Qwen2.5-3B-WebArena-Lite-SFT-CoT-QwQ-32B-epoch-10

Hosted on Hugging Face

  • Task: Text generation
  • Concurrency cost: 1
  • Model size: 3.1B
  • Quantization: BF16
  • Context length: 32k
  • Published: Apr 2, 2025
  • Architecture: Transformer

The weizhepei/Qwen2.5-3B-WebArena-Lite-SFT-CoT-QwQ-32B-epoch-10 model is a 3.1-billion-parameter language model by weizhepei, fine-tuned from Qwen/Qwen2.5-3B-Instruct. It was trained on the weizhepei/webarena-lite-SFT-CoT-QwQ-32B dataset using Supervised Fine-Tuning (SFT) with the TRL framework; the checkpoint name indicates this is the epoch-10 snapshot. The specialized training data targets tasks in web environments and multi-step reasoning.
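
To try the checkpoint locally, a minimal loading sketch using the standard Hugging Face Transformers API is shown below. The model id comes from this card; the dtype and device placement are assumptions consistent with the BF16 listing above.

```python
# Minimal loading sketch (standard Transformers usage; dtype/device are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "weizhepei/Qwen2.5-3B-WebArena-Lite-SFT-CoT-QwQ-32B-epoch-10"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 quantization listed above
    device_map="auto",           # requires `accelerate` for automatic placement
)
```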


Model Overview

This model, weizhepei/Qwen2.5-3B-WebArena-Lite-SFT-CoT-QwQ-32B-epoch-10, is a specialized language model with 3.1 billion parameters. It builds on the Qwen2.5-3B-Instruct base and was adapted with Supervised Fine-Tuning (SFT) using the TRL framework.

Key Capabilities

  • Web-centric Task Performance: Fine-tuned on the weizhepei/webarena-lite-SFT-CoT-QwQ-32B dataset, indicating a strong focus on tasks within web environments.
  • Reasoning and Instruction Following: The base Qwen2.5-3B-Instruct model provides a solid foundation for instruction following, which is further enhanced by the specialized SFT.
  • Chain-of-Thought (CoT) Integration: The training dataset name suggests an emphasis on Chain-of-Thought reasoning, which can improve the model's ability to break down and solve complex problems (see the prompting sketch after this list).
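
As an illustration of CoT-style prompting, the sketch below reuses the tokenizer and model loaded earlier. Applying the chat template is standard for Qwen2.5 checkpoints, but the prompt text and generation settings are illustrative assumptions, not values from the model card.

```python
# Hedged sketch: eliciting step-by-step reasoning on a web-style task.
messages = [
    {"role": "user", "content": (
        "You are operating a web browser. Task: find the cheapest laptop "
        "on the current page. Think step by step, then state your next action."
    )},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```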

Good For

  • Web Automation and Interaction: Ideal for applications requiring understanding of, and interaction with, web interfaces or structured web data; a minimal interaction loop is sketched after this list.
  • Complex Instruction Following: Suitable for scenarios where detailed, multi-step instructions need to be processed and executed.
  • Research in Web-based LLM Applications: Provides a strong baseline for further experimentation and development in web-centric AI tasks.
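
For web automation, one plausible way to wire the model into an observation-to-action loop is sketched below, reusing the objects loaded earlier. The prompt layout and the WebArena-style action regex (e.g. matching `click [1234]`) are assumptions for illustration, not a documented interface of this checkpoint.

```python
# Hypothetical helper: feed a task plus an accessibility-tree observation,
# then extract a WebArena-style action string from the model's reply.
# Prompt format and action grammar are assumptions, not from the model card.
import re

def next_action(task: str, observation: str) -> str:
    messages = [{"role": "user", "content": (
        f"Task: {task}\n\nCurrent page (accessibility tree):\n{observation}\n\n"
        "Reason step by step, then give exactly one action on the last line."
    )}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=512, do_sample=False)
    reply = tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
    # Assumed action grammar: verb followed by a bracketed element id.
    match = re.search(r"(click|type|scroll|goto)\s*\[[^\]]*\][^\n]*", reply)
    return match.group(0) if match else reply.strip().splitlines()[-1]
```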