laion/glm-4_6-freelancer-32ep-131k-torch

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · License: apache-2.0 · Architecture: Transformer · Open Weights

The laion/glm-4_6-freelancer-32ep-131k-torch model is an 8 billion parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained on the penfever/glm-4.6-freelancer-32ep-131k-torch dataset for 7 epochs with a 32768-token context length and is intended for general language generation tasks, inheriting the broad applicability of the Qwen3 architecture.


Model Overview

laion/glm-4_6-freelancer-32ep-131k-torch is an 8 billion parameter language model fine-tuned by laion from Qwen/Qwen3-8B on the penfever/glm-4.6-freelancer-32ep-131k-torch dataset.
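
The model can be loaded with the standard transformers auto classes. The snippet below is a minimal sketch: the repository id comes from this card, while the dtype and device-mapping choices are illustrative assumptions (`device_map="auto"` additionally requires the accelerate package).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "laion/glm-4_6-freelancer-32ep-131k-torch"

# Load the tokenizer and model weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # requires `accelerate`; places layers on available devices
)
```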

Training Details

The model was trained for 7 epochs with a learning rate of 4e-05 and an effective batch size of 16 (train_batch_size of 1 × gradient_accumulation_steps of 2 × 8 GPUs). The optimizer was ADAMW_TORCH_FUSED (beta and epsilon values are not listed here), paired with a cosine learning rate scheduler and a 0.1 warmup ratio. Training used Transformers 4.57.3, PyTorch 2.9.0+cu128, Datasets 4.4.1, and Tokenizers 0.22.1.
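
For reference, these hyperparameters map onto the standard transformers TrainingArguments roughly as follows. This is a reconstruction from the reported values, not the actual training script; the output directory name is a placeholder.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="glm-4_6-freelancer-32ep-131k-torch",  # placeholder
    num_train_epochs=7,
    learning_rate=4e-5,
    per_device_train_batch_size=1,   # train_batch_size
    gradient_accumulation_steps=2,   # 1 x 2 x 8 GPUs = 16 effective batch size
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch_fused",       # ADAMW_TORCH_FUSED
)
```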

Intended Use

Specific intended uses and limitations are not documented for this checkpoint. As a fine-tune of Qwen3-8B, it should be broadly suitable for general natural language processing tasks; developers should weigh the base model's capabilities and the characteristics of the fine-tuning dataset when evaluating it for a specific application. A typical generation call is sketched below.
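
Assuming the checkpoint inherits Qwen3's chat template (plausible for a Qwen3-8B fine-tune, but not confirmed by this card), generation can follow the usual chat-template flow; the prompt and decoding settings below are illustrative.

```python
# Continues from the loading snippet above.
messages = [{"role": "user", "content": "Summarize the trade-offs of fine-tuning an 8B model."}]

# apply_chat_template formats the conversation the way the model expects.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```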