laion/r2egym-unified-316__Qwen3-8B
The laion/r2egym-unified-316__Qwen3-8B is an 8 billion parameter language model, fine-tuned from the Qwen/Qwen3-8B architecture. It was trained on the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--r2egym-unified-316/snapshots/7ca94a2abcbab7f0c392f62ec288691cdea20260_thinking_preprocessed dataset. This model is designed for general language tasks, leveraging its 32768 token context length for processing extensive inputs.
Loading preview...
Model Overview
This model, laion/r2egym-unified-316__Qwen3-8B, is an 8 billion parameter language model built upon the Qwen3-8B architecture developed by Qwen. It has been specifically fine-tuned using the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--r2egym-unified-316/snapshots/7ca94a2abcbab7f0c392f62ec288691cdea20260_thinking_preprocessed dataset.
Training Details
The fine-tuning process involved a learning rate of 4e-05 and utilized 32 devices with a multi-GPU distributed type. A cosine learning rate scheduler with a 0.1 warmup ratio was employed over 7.0 epochs. The training was conducted using Transformers 4.57.6, Pytorch 2.9.1+cu130, Datasets 4.7.0, and Tokenizers 0.22.2.
Key Characteristics
- Base Model: Qwen3-8B
- Parameter Count: 8 billion
- Context Length: 32768 tokens
- Fine-tuning Dataset:
laion/r2egym-unified-316dataset
Potential Use Cases
Given its foundation on the Qwen3-8B model and specific fine-tuning, this model is suitable for general language understanding and generation tasks, particularly those benefiting from its extended context window.