laion/glm-4_6-nemo-prism

Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 32k · License: apache-2.0 · Architecture: Transformer · Open weights

laion/glm-4_6-nemo-prism is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B on the penfever/glm-4.6-nemo-prism dataset, from which its specialization derives. With a context length of 32768 tokens, it is suited to tasks that benefit from extensive context.


Model Overview

laion/glm-4_6-nemo-prism is an 8-billion-parameter language model fine-tuned from the base model Qwen/Qwen3-8B, adapted through training on the penfever/glm-4.6-nemo-prism dataset.
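
For orientation, here is a minimal inference sketch using the Hugging Face transformers library. The repository id comes from this card; the prompt, dtype, and device-placement settings are illustrative assumptions, not documented usage.

  # Minimal sketch, assuming the checkpoint loads through the standard
  # causal-LM API; dtype/device settings are illustrative.
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "laion/glm-4_6-nemo-prism"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
      model_id,
      torch_dtype="auto",   # use the precision stored in the checkpoint
      device_map="auto",    # place weights on available accelerators
  )

  prompt = "Explain the difference between a process and a thread."
  inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
  outputs = model.generate(**inputs, max_new_tokens=256)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))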

Training Details

The fine-tuning process used a learning rate of 4e-05 and a total training batch size of 16 across 8 devices. Training ran for 7 epochs with the AdamW optimizer, a cosine learning-rate schedule, and a warmup ratio of 0.1. The model was trained with Transformers 4.56.1 and PyTorch 2.9.1+cu128.
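
As a rough illustration, the reported hyperparameters map onto a transformers TrainingArguments configuration like the sketch below. The per-device batch size of 2 is inferred from the total batch size of 16 across 8 devices (gradient accumulation, if any, is not reported), and the output directory and precision flag are assumptions.

  # Sketch of the reported hyperparameters as TrainingArguments;
  # per_device_train_batch_size is inferred (2 x 8 devices = 16 total).
  from transformers import TrainingArguments

  training_args = TrainingArguments(
      output_dir="glm-4_6-nemo-prism",   # hypothetical output path
      learning_rate=4e-5,
      per_device_train_batch_size=2,     # inferred from 16 total / 8 devices
      num_train_epochs=7,
      optim="adamw_torch",
      lr_scheduler_type="cosine",
      warmup_ratio=0.1,
      bf16=True,                         # assumed mixed-precision setting
  )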

Key Characteristics

  • Base Architecture: Derived from Qwen3-8B.
  • Parameter Count: 8 billion parameters.
  • Context Length: Supports a context window of 32768 tokens (see the long-context sketch after this list).
  • Fine-tuning Dataset: Trained on the penfever/glm-4.6-nemo-prism dataset, which defines the model's area of specialization.
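
To make the 32768-token window concrete, the sketch below truncates a long document to fit the context before summarizing it. The input file name is hypothetical, and reserving headroom for generated tokens is an assumption about sensible usage rather than a documented requirement.

  # Long-context sketch: truncate the prompt to the 32768-token window
  # reported on this card, leaving headroom for the generated summary.
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "laion/glm-4_6-nemo-prism"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
      model_id, torch_dtype="auto", device_map="auto"
  )

  with open("report.txt") as f:  # hypothetical long input document
      long_document = f.read()

  inputs = tokenizer(
      "Summarize the following document:\n\n" + long_document,
      return_tensors="pt",
      truncation=True,
      max_length=32768 - 512,    # reserve 512 tokens for the summary
  ).to(model.device)

  output = model.generate(**inputs, max_new_tokens=512)
  new_tokens = output[0][inputs["input_ids"].shape[1]:]
  print(tokenizer.decode(new_tokens, skip_special_tokens=True))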