laion/syh-r2eg-askl-glm_4-7_trac_jupi_-gfi-swes-rand-filt-10K_glm_4-7_trac_jupi_32B

Text generation | Model size: 32B | Quantization: FP8 | Context length: 32k | Published: Mar 4, 2026 | License: other | Architecture: Transformer | Concurrency cost: 2

laion/syh-r2eg-askl-glm_4-7_trac_jupi_-gfi-swes-rand-filt-10K_glm_4-7_trac_jupi_32B is a 32-billion-parameter language model fine-tuned from Qwen/Qwen3-32B. It was trained on two datasets: 'exp-syh-r2egym-askllm-constrained_glm_4.7_traces_jupiter_cleaned' and 'exp-gfi-swesmith-random-filtered-10K_glm_4.7_traces_jupiter'. Given this training data, the model is likely specialized for tasks such as constrained language generation or trace analysis, and supports a context length of 32,768 tokens.


Model Overview

This model, laion/syh-r2eg-askl-glm_4-7_trac_jupi_-gfi-swes-rand-filt-10K_glm_4-7_trac_jupi_32B, is a 32-billion-parameter language model derived from the Qwen/Qwen3-32B base architecture. It was fine-tuned on two datasets: exp-syh-r2egym-askllm-constrained_glm_4.7_traces_jupiter_cleaned and exp-gfi-swesmith-random-filtered-10K_glm_4.7_traces_jupiter. Fine-tuning ran for 7 epochs with a learning rate of 4e-05 and a total training batch size of 32 across 16 GPUs.

Training Details

  • Base Model: Qwen/Qwen3-32B
  • Parameter Count: 32 billion
  • Context Length: 32768 tokens
  • Datasets Used:
    • exp-syh-r2egym-askllm-constrained_glm_4.7_traces_jupiter_cleaned
    • exp-gfi-swesmith-random-filtered-10K_glm_4.7_traces_jupiter
  • Key Hyperparameters:
    • Learning Rate: 4e-05
    • Optimizer: ADAMW_TORCH_FUSED
    • Number of Epochs: 7.0
    • Gradient Accumulation Steps: 2
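The hyperparameters above imply a per-device batch size of 1, which can be checked with a quick arithmetic sketch (only the total batch of 32, the 16 GPUs, and the gradient accumulation of 2 are stated on the card; the per-device value is inferred here):

```python
# Sanity check on the training batch arithmetic reported on the card.
# Stated values: total batch 32, 16 GPUs, gradient accumulation 2.
NUM_GPUS = 16
GRAD_ACCUM_STEPS = 2
TOTAL_BATCH = 32

# Per-device micro-batch size implied by the stated totals (inferred, not stated).
per_device_batch = TOTAL_BATCH // (NUM_GPUS * GRAD_ACCUM_STEPS)
print(per_device_batch)  # 1

# Effective batch seen by each optimizer step, reconstructed from the parts.
effective_batch = per_device_batch * NUM_GPUS * GRAD_ACCUM_STEPS
print(effective_batch)  # 32
```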

Potential Use Cases

Given its fine-tuning on specific trace-related datasets, this model is likely optimized for tasks that involve:

  • Processing or generating text based on structured traces.
  • Understanding or responding within constrained language environments.
  • Applications requiring specialized knowledge derived from the training data's domain.
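To experiment with these use cases, a minimal loading sketch using the Hugging Face transformers library is shown below. This assumes the standard AutoModelForCausalLM / AutoTokenizer API; the `device_map` and dtype choices are illustrative, and the `fits_in_context` helper is a hypothetical convenience, not part of the repository.

```python
# Minimal sketch for loading this checkpoint with transformers (assumption:
# the repo follows the standard AutoModelForCausalLM interface).
REPO_ID = ("laion/syh-r2eg-askl-glm_4-7_trac_jupi_"
           "-gfi-swes-rand-filt-10K_glm_4-7_trac_jupi_32B")
MAX_CTX = 32_768  # context length stated on the model card


def fits_in_context(prompt_tokens: int, max_new_tokens: int = 0) -> bool:
    """Hypothetical helper: check a request stays within the 32k window."""
    return prompt_tokens + max_new_tokens <= MAX_CTX


def load_model(device_map: str = "auto"):
    """Load tokenizer and model; transformers is imported lazily so the
    helper above is usable without the library installed."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(
        REPO_ID,
        torch_dtype="auto",   # honor the checkpoint's stored dtype
        device_map=device_map,
    )
    return tokenizer, model
```

Note that a 32B checkpoint requires substantial GPU memory even at FP8/BF16; `device_map="auto"` lets accelerate shard the weights across available devices.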