laion/r2egym-unified-3160__Qwen3-8B
The laion/r2egym-unified-3160__Qwen3-8B is an 8 billion parameter language model, fine-tuned from the Qwen/Qwen3-8B architecture. It was trained on the laion/r2egym-unified-3160 dataset, suggesting a specialization in areas related to the dataset's content. With a 32K context length, it is suitable for tasks requiring extensive contextual understanding.
Loading preview...
Model Overview
This model, laion/r2egym-unified-3160__Qwen3-8B, is an 8 billion parameter language model derived from the Qwen3-8B architecture by Qwen. It has been specifically fine-tuned on the /e/data1/datasets/playground/ot/hf_hub/datasets--laion--r2egym-unified-3160/snapshots/71ad17eb828b8f58fcd7ba2258c57cdc63ae6be5_thinking_preprocessed dataset. This fine-tuning process involved a learning rate of 4e-05, a total batch size of 96, and was trained for 7 epochs using a cosine learning rate scheduler with a 0.1 warmup ratio.
Key Training Details
- Base Model: Qwen/Qwen3-8B
- Parameter Count: 8 Billion
- Context Length: 32,768 tokens
- Fine-tuning Dataset: laion/r2egym-unified-3160
- Optimizer: ADAMW_TORCH_FUSED
- Epochs: 7.0
Intended Use
While specific intended uses and limitations are not detailed in the provided README, its fine-tuning on a specialized dataset suggests potential applications within the domain covered by laion/r2egym-unified-3160. Developers should evaluate its performance on tasks relevant to this dataset to determine suitability.