laion/syh-r2eg-askl-glm_4-7_trac_jupi_-gfi-swes-rand-filt-10K_glm_4-7_trac_jupi_32B
laion/syh-r2eg-askl-glm_4-7_trac_jupi_-gfi-swes-rand-filt-10K_glm_4-7_trac_jupi_32B is a 32-billion-parameter language model fine-tuned from Qwen/Qwen3-32B, with a context length of 32768 tokens. It was trained on the exp-syh-r2egym-askllm-constrained_glm_4.7_traces_jupiter_cleaned and exp-gfi-swesmith-random-filtered-10K_glm_4.7_traces_jupiter datasets, and is likely specialized for tasks related to that data, such as constrained language generation or trace analysis.
Model Overview
This model, laion/syh-r2eg-askl-glm_4-7_trac_jupi_-gfi-swes-rand-filt-10K_glm_4-7_trac_jupi_32B, is a 32-billion-parameter language model derived from the Qwen/Qwen3-32B base model. It was fine-tuned on two datasets: exp-syh-r2egym-askllm-constrained_glm_4.7_traces_jupiter_cleaned and exp-gfi-swesmith-random-filtered-10K_glm_4.7_traces_jupiter. Fine-tuning ran for 7 epochs with a learning rate of 4e-05 and a total training batch size of 32 across 16 GPUs.
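As a rough guide, the checkpoint should load with the standard Hugging Face transformers API like any other Qwen3-based model. The sketch below is illustrative only: it assumes bf16 weights sharded across available GPUs via device_map="auto" and that the repository ships the usual Qwen3 chat template; the prompt content is a placeholder, not a prescribed input format.

```python
# Minimal inference sketch (assumes the repo id below is available on the Hub
# and that the standard Qwen3 chat template is bundled with the tokenizer).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/syh-r2eg-askl-glm_4-7_trac_jupi_-gfi-swes-rand-filt-10K_glm_4-7_trac_jupi_32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed precision; 32B parameters will need multiple GPUs
    device_map="auto",
)

# Placeholder prompt; the actual input format depends on the fine-tuning data.
messages = [{"role": "user", "content": "Summarize the following execution trace: ..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```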
Training Details
- Base Model: Qwen/Qwen3-32B
- Parameter Count: 32 billion
- Context Length: 32768 tokens
- Datasets Used:
- exp-syh-r2egym-askllm-constrained_glm_4.7_traces_jupiter_cleaned
- exp-gfi-swesmith-random-filtered-10K_glm_4.7_traces_jupiter
- Key Hyperparameters (a configuration sketch follows this list):
- Learning Rate: 4e-05
- Optimizer: ADAMW_TORCH_FUSED
- Number of Epochs: 7.0
- Gradient Accumulation Steps: 2
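The hyperparameters above can be expressed as a transformers TrainingArguments configuration, sketched below under the assumption that training used the standard Hugging Face Trainer. The per-device batch size of 1 is inferred from the reported total batch size of 32 across 16 GPUs with 2 gradient accumulation steps (32 / (16 × 2) = 1); the output directory and precision are assumptions, not values stated on this card.

```python
# Hedged sketch: how the reported hyperparameters map onto TrainingArguments.
# Values not listed above are assumptions for illustration.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./checkpoints",        # assumed path
    learning_rate=4e-05,
    num_train_epochs=7.0,
    per_device_train_batch_size=1,     # 32 total / (16 GPUs x 2 accumulation steps)
    gradient_accumulation_steps=2,
    optim="adamw_torch_fused",         # ADAMW_TORCH_FUSED as listed
    bf16=True,                         # assumed precision
)
```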
Potential Use Cases
Given its fine-tuning on specific trace-related datasets, this model is likely optimized for tasks that involve:
- Processing or generating text based on structured traces.
- Understanding or responding within constrained language environments.
- Applications requiring specialized knowledge derived from the training data's domain.