XXHStudyHard/EnvScaler-Qwen3-8B

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Jan 8, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

EnvScaler-Qwen3-8B is an 8 billion parameter language model based on Qwen3-8B, developed by XXHStudyHard using the EnvScaler framework. It is specifically trained through Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to excel in tool-interactive agent tasks. This model is optimized for complex scenarios requiring interaction with external tools and environments, making it suitable for advanced agentic applications.

Loading preview...

EnvScaler-Qwen3-8B: Tool-Enhanced Agent Model

EnvScaler-Qwen3-8B is an 8 billion parameter language model built upon the Qwen3-8B (Thinking Mode) architecture. Developed by XXHStudyHard, this model is uniquely designed for tool-interactive agent tasks through the EnvScaler framework.

Key Capabilities & Training:

  • Tool Interaction: Specifically trained to understand and utilize external tools within complex environments.
  • Agentic Behavior: Optimized for scenarios where the model acts as an agent, interacting with its surroundings.
  • Two-Stage Training: Utilizes a robust training methodology:
    • Supervised Fine-Tuning (SFT): Trained on 9,022 trajectories from agent-environment interactions, sourced from EnvScaler-SFT-Traj-9K across 4,684 scenarios and 141 synthesized environments.
    • Reinforcement Learning (RL): Further refined using 2,550 RL scenarios and 50 synthesized environments, leveraging the ROLL framework to enhance performance in dynamic, interactive tasks.

Use Cases:

This model is particularly well-suited for applications requiring language models to perform actions, use tools, and navigate complex, interactive environments. Developers can integrate it with the EnvScaler project for full functionality in agent-based systems.