Adanato/qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_0

Text generation · 3.1B parameters · BF16 · 32k context · Published: Feb 16, 2026 · License: other · Architecture: Transformer

Adanato/qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_0 is a 3.1 billion parameter language model fine-tuned from Qwen/Qwen2.5-3B on the qwen25_qwen3_rank_only_cluster_0 dataset. It supports a 32768-token context length and was trained for a single epoch at a learning rate of 1e-05. What distinguishes it is its specialized fine-tuning for ranking tasks involving the Qwen2.5 and Qwen3 model families.


Model Overview

Adanato/qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_0 is a specialized 3.1 billion parameter language model derived from the Qwen/Qwen2.5-3B base architecture. This model has undergone specific fine-tuning on the qwen25_qwen3_rank_only_cluster_0 dataset, indicating an optimization for tasks involving ranking or preference modeling within the Qwen 2.5 and Qwen 3 ecosystems.
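The model should load like any other `transformers` causal LM. The sketch below is an assumption on our part, not documented usage: the repo id and 32k context come from the card, while the BF16 dtype choice and generation settings are illustrative.

```python
# Minimal loading sketch (assumptions: `transformers` is installed and there is
# enough memory for a 3.1B-parameter model in BF16; not documented by the card).

REPO_ID = "Adanato/qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_0"
MAX_CONTEXT = 32768  # context length listed on the card


def load_model(repo_id: str = REPO_ID):
    """Load tokenizer and model; imports are local so the sketch imports cleanly."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="bfloat16")
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_model()
    inputs = tokenizer("Which of these answers is better?", return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```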

Key Training Details

The model was trained using the following hyperparameters:

  • Base Model: Qwen/Qwen2.5-3B
  • Dataset: qwen25_qwen3_rank_only_cluster_0
  • Learning Rate: 1e-05
  • Epochs: 1.0
  • Batch Size: 4 per device (train), 8 per device (eval), with 8 gradient accumulation steps, for a total effective training batch size of 128.
  • Optimizer: AdamW_Torch_Fused with a cosine learning-rate scheduler and a warmup ratio of 0.1.
  • Context Length: 32768 tokens
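The batch-size figures above are internally consistent only if training used multiple data-parallel devices. The device count is our inference, not something the card states; a quick sanity check:

```python
# Sanity-check the effective batch size implied by the card's hyperparameters.
# The number of devices is inferred: 128 / (4 * 8) = 4 (an assumption, not stated).
per_device_train_batch = 4
grad_accum_steps = 8
effective_batch = 128

num_devices = effective_batch // (per_device_train_batch * grad_accum_steps)
assert per_device_train_batch * grad_accum_steps * num_devices == effective_batch
print(num_devices)  # → 4
```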

Potential Use Cases

Given its fine-tuning on a ranking-specific dataset, this model is likely best suited for applications requiring:

  • Preference Modeling: Understanding and predicting user preferences or rankings.
  • Comparative Analysis: Tasks where comparing and ordering different outputs or options is crucial.
  • Specialized Evaluation: Potentially useful in evaluating or scoring outputs from other Qwen2.5 or Qwen3 models based on learned ranking criteria.
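One common way to use a causal LM for the comparative tasks listed above is to score each candidate by its length-normalized log-probability under the model and sort. The card does not describe the model's actual scoring interface, so the helper below is a model-agnostic sketch with made-up log-probabilities:

```python
def sequence_score(token_logprobs):
    """Length-normalized log-probability: higher means the model prefers the text."""
    return sum(token_logprobs) / len(token_logprobs)


def rank(candidates):
    """candidates: list of (text, per-token log-probs); returns texts best-first."""
    return [
        text
        for text, lps in sorted(
            candidates, key=lambda c: sequence_score(c[1]), reverse=True
        )
    ]


# Toy example with invented log-probs (in practice these would come from the model).
cands = [
    ("answer A", [-0.2, -0.1, -0.3]),  # avg -0.2
    ("answer B", [-1.5, -2.0]),        # avg -1.75
]
print(rank(cands))  # → ['answer A', 'answer B']
```

In a real pipeline the per-token log-probs would be read off the model's logits for each candidate completion; the ranking logic itself stays the same.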