laion/GLM-4_7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k

TEXT GENERATION · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Feb 9, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

The laion/GLM-4_7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the DCAgent2/GLM-4.7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k dataset, suggesting a specialization in agent-based or sandboxed environments. The model is likely optimized for tasks that involve interaction within defined test or simulation frameworks, and it supports a 32,768-token context window.


Model Overview

This model, laion/GLM-4_7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k, is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the DCAgent2/GLM-4.7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k dataset.

Key Characteristics

  • Base model: Fine-tuned from Qwen/Qwen3-8B.
  • Parameter count: 8 billion.
  • Context length: 32,768 tokens.
  • Training data: The DCAgent2/GLM-4.7-swesmith-sandboxes-with_tests-oracle_verified_120s-maxeps-131k dataset, indicating potential optimization for interactive or agent-based tasks.
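A practical consequence of the 32,768-token context window is that prompt length and generation length share one budget. The helper below is a hypothetical sketch (not part of the model's tooling): `prompt_budget` and `trim_to_budget` are names introduced here for illustration.

```python
# Hypothetical helper: split the model's 32,768-token window between
# the prompt and the desired generation budget.
MAX_CONTEXT = 32768  # context length stated in the model card


def prompt_budget(max_new_tokens: int, max_context: int = MAX_CONTEXT) -> int:
    """Tokens the prompt may use while leaving room for generation."""
    if not 0 < max_new_tokens < max_context:
        raise ValueError("max_new_tokens must lie in (0, max_context)")
    return max_context - max_new_tokens


def trim_to_budget(token_ids: list[int], max_new_tokens: int) -> list[int]:
    """Keep only the most recent tokens that fit the prompt budget."""
    budget = prompt_budget(max_new_tokens)
    return token_ids[-budget:]
```

For example, reserving 2,048 tokens for generation leaves 30,720 tokens of prompt; older tokens beyond that are dropped from the front.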

Training Details

Fine-tuning used a learning rate of 4e-05 and a total train batch size of 16 (a per-device batch size of 1 across 8 GPUs with 2 gradient accumulation steps), with a cosine learning-rate scheduler, a warmup ratio of 0.1, and 7 epochs of training. The optimizer was fused AdamW (adamw_torch_fused).
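The batch-size arithmetic and the cosine-with-warmup schedule can be sketched as below. This is an illustrative reimplementation, not the training code; `total_steps` is a placeholder, since the actual step count depends on dataset size.

```python
import math

# Hyperparameters from the training details above.
PEAK_LR = 4e-5
WARMUP_RATIO = 0.1
DEVICES = 8
GRAD_ACCUM = 2
TOTAL_BATCH = 16
# Per-device batch size implied by the card: 16 / (8 * 2) = 1.
PER_DEVICE_BATCH = TOTAL_BATCH // (DEVICES * GRAD_ACCUM)


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))
```

With a 0.1 warmup ratio, the first 10% of steps ramp linearly from 0 to 4e-05, after which the rate decays along a half-cosine to 0.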

Potential Use Cases

Given its fine-tuning dataset, this model is likely suited for applications involving:

  • Agent-based systems: Interacting within defined environments or simulations.
  • Sandbox testing: Generating or interpreting actions within constrained, verifiable systems.
  • Automated verification: Checking generated outputs against oracle tests within a fixed time budget (e.g., the 120 s suggested by the dataset name).
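The "oracle_verified_120s" fragment of the dataset name hints at test runs bounded by a 120-second budget. A minimal sketch of such a check might look like the following; the actual harness and commands used to build the dataset are not documented in the card, and `run_oracle_tests` is a name introduced here for illustration.

```python
import subprocess
import sys

# Hypothetical time budget, as suggested by the dataset name.
TIMEOUT_S = 120


def run_oracle_tests(test_cmd: list[str], timeout: int = TIMEOUT_S) -> bool:
    """Run an oracle test command; pass means exit code 0 within budget."""
    try:
        result = subprocess.run(test_cmd, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False  # exceeding the time budget counts as a failure
    return result.returncode == 0
```

A caller might invoke it as `run_oracle_tests([sys.executable, "-m", "pytest", "-q"])`, treating a timeout the same as a failing test suite.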

Further details on specific intended uses and limitations are not provided in the current model description.