Adanato/qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_4
Hugging Face
Task: Text generation | Concurrency cost: 1 | Model size: 3.1B | Quantization: BF16 | Context length: 32k | Published: Feb 16, 2026 | License: other | Architecture: Transformer

Adanato/qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_4 is a 3.1 billion parameter language model fine-tuned from Qwen/Qwen2.5-3B. It was fine-tuned on the qwen25_qwen3_rank_only_cluster_4 dataset, whose name suggests an optimization for ranking-related tasks or for a specific data cluster. It targets applications that need a compact, specialized model with a 32,768-token context length.

Model Overview

This model, Adanato/qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_4, is a fine-tuned variant of Qwen/Qwen2.5-3B, a base model developed by Qwen. It has approximately 3.1 billion parameters and supports a context length of 32,768 tokens, so the prompt and the generated output together must fit within that window.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen2.5-3B.
  • Fine-tuning Dataset: Trained on the qwen25_qwen3_rank_only_cluster_4 dataset. The name suggests a specialization in ranking-related tasks or in a particular data cluster, but the dataset's contents are not described.
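
As a rough illustration of how a model with these characteristics might be loaded and queried, here is a minimal sketch. It assumes the repository is loadable with the Hugging Face transformers library and that torch is installed; neither is confirmed by the card. The helper names (fits_in_context, load_model, generate) are hypothetical and exist only for this example. The BF16 dtype and the 32768-token limit are taken from the listing above.

```python
MODEL_ID = "Adanato/qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_4"
MAX_CONTEXT_TOKENS = 32768  # context length stated in the card


def fits_in_context(prompt_tokens: int, max_new_tokens: int) -> bool:
    """Return True if the prompt plus the requested output fits the context window."""
    return prompt_tokens + max_new_tokens <= MAX_CONTEXT_TOKENS


def load_model():
    """Load tokenizer and model in BF16.

    Assumption: the repository works with the standard transformers auto classes.
    Imports are local so the module can be imported without torch installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    model.eval()
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a continuation for `prompt`, refusing inputs that overflow the window."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_tokens = inputs["input_ids"].shape[1]
    if not fits_in_context(prompt_tokens, max_new_tokens):
        raise ValueError("prompt plus max_new_tokens exceeds the 32768-token context")

    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    return tokenizer.decode(output_ids[0][prompt_tokens:], skip_special_tokens=True)
```

With the dependencies installed, calling `tokenizer, model = load_model()` followed by `generate(tokenizer, model, "your prompt")` would run a single generation on the default device. Since the base model is not instruction-tuned and the card does not describe a prompt format, plain-text prompting is only a guess.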

Training Details

The model was trained using the following hyperparameters:

  • Learning Rate: 1e-05
  • Optimizer: AdamW_Torch_Fused with betas=(0.9, 0.999) and epsilon=1e-08.
  • Batch Size: A total training batch size of 128, the product of train_batch_size (4), gradient_accumulation_steps (8), and num_devices (4).
  • Epochs: Trained for 1.0 epoch.
  • Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
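
The hyperparameters above can be sanity-checked with a few lines of arithmetic. The sketch below restates them as a plain dictionary, recomputes the total batch size, and evaluates a linear-warmup-then-cosine-decay schedule. The dictionary keys mirror Hugging Face TrainingArguments field names, but whether training actually used that class is an assumption. The total step count of 1000 in the usage note is hypothetical, since the card does not give the dataset size, and the rounding of warmup steps may differ from the trainer that was used.

```python
import math

# Hyperparameters as listed in the card. Key names follow Hugging Face
# TrainingArguments fields (assumption: the card does not name the trainer).
HPARAMS = {
    "learning_rate": 1e-5,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 8,
    "num_train_epochs": 1.0,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
}
NUM_DEVICES = 4  # from the card


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int) -> int:
    """Samples contributing to one optimizer step across all devices."""
    return per_device * grad_accum * num_devices


def lr_at_step(
    step: int,
    total_steps: int,
    peak_lr: float = HPARAMS["learning_rate"],
    warmup_ratio: float = HPARAMS["warmup_ratio"],
) -> float:
    """Learning rate under linear warmup followed by half-cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_batch = effective_batch_size(
    HPARAMS["per_device_train_batch_size"],
    HPARAMS["gradient_accumulation_steps"],
    NUM_DEVICES,
)
```

Here `total_batch` reproduces the stated value of 128. With a hypothetical run of 1000 optimizer steps, `lr_at_step(100, 1000)` sits at the 1e-05 peak once warmup ends, and the rate then decays toward zero by step 1000.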

Potential Use Cases

Judging by the name of its fine-tuning dataset, this model may be best suited for:

  • Tasks involving ranking or preference prediction.
  • Applications within the specific data domain of the qwen25_qwen3_rank_only_cluster_4 dataset.

Further details on specific intended uses and limitations are not provided in the original model card.