Adanato/qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_5

  • Task: Text generation
  • Model size: 3.1B parameters
  • Quantization: BF16
  • Context length: 32k
  • Published: Feb 16, 2026
  • License: other
  • Architecture: Transformer
  • Concurrency cost: 1

Adanato/qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_5 is a 3.1-billion-parameter language model fine-tuned from Qwen/Qwen2.5-3B on the qwen25_qwen3_rank_only_cluster_5 dataset, which suggests a specialization in ranking tasks or in a specific data cluster. It supports a context length of 32,768 tokens, making it suitable for applications that process moderately long inputs; its primary utility lies in tasks aligned with its fine-tuning dataset.
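
The card ships no usage code, so the following is a minimal loading-and-generation sketch, assuming the checkpoint exposes the standard Qwen2.5 causal-LM interface through the transformers library; the prompt string is purely illustrative.

```python
# Minimal sketch: load the checkpoint and generate text.
# Assumption: the repo works with the standard transformers causal-LM API,
# as Qwen2.5 checkpoints normally do; this is not confirmed by the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Adanato/qwen25_3b_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_5"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above
    device_map="auto",           # requires the accelerate package
)

prompt = "Rank the following passages by relevance to the query: ..."  # illustrative
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```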


Model Overview

This model is a fine-tuned variant of the Qwen2.5-3B base model developed by Qwen. It has approximately 3.1 billion parameters and supports a context length of 32,768 tokens.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen2.5-3B.
  • Fine-tuning Dataset: Specifically trained on the qwen25_qwen3_rank_only_cluster_5 dataset, indicating a potential specialization in ranking or cluster-specific tasks.
  • Training Hyperparameters: The fine-tuning run used a learning rate of 1e-05, a batch size of 4, and the AdamW optimizer with a cosine learning-rate schedule over 1 epoch; a configuration sketch mirroring these settings follows this list.
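
Only the hyperparameters above are documented, not the training script itself. A TrainingArguments configuration mirroring those settings might look like the sketch below; the output directory is a hypothetical placeholder, and the bf16 flag is an assumption based on the BF16 precision in the metadata.

```python
# Sketch of a transformers TrainingArguments matching the reported settings:
# lr 1e-05, batch size 4, AdamW, cosine schedule, 1 epoch.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen25_3b_rank_only_cluster_5",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    optim="adamw_torch",          # AdamW optimizer
    lr_scheduler_type="cosine",   # cosine learning-rate schedule
    num_train_epochs=1,
    bf16=True,                    # assumption, based on the BF16 metadata
)
# Passing these to transformers.Trainer together with the model, tokenizer,
# and a tokenized copy of the dataset would complete the setup.
```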

Potential Use Cases

Given its fine-tuning on a specific ranking-oriented dataset, this model is likely best suited for the following; a likelihood-based scoring sketch appears after the list:

  • Ranking tasks that resemble the examples in the qwen25_qwen3_rank_only_cluster_5 dataset.
  • Tasks involving processing and understanding data within specific clusters as defined by its training data.
  • Scenarios where a 3.1 billion parameter model with a large context window is beneficial for specialized language understanding.
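
The card does not document the prompt format used during fine-tuning, so any concrete ranking recipe is speculative. One generic approach is to score each candidate by its likelihood under the model and sort by that score; the query/passage template below is a hypothetical placeholder, and model and tokenizer are assumed to be loaded as in the earlier sketch.

```python
# Illustrative ranking-by-likelihood sketch; the template is an assumption,
# not the documented format of the qwen25_qwen3_rank_only_cluster_5 data.
import torch

def score_candidate(model, tokenizer, query: str, candidate: str) -> float:
    text = f"Query: {query}\nPassage: {candidate}"  # hypothetical template
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**inputs, labels=inputs["input_ids"])
    # out.loss is the mean token cross-entropy; negate so higher = better.
    return -out.loss.item()

def rank_candidates(model, tokenizer, query: str, candidates: list[str]) -> list[str]:
    return sorted(
        candidates,
        key=lambda c: score_candidate(model, tokenizer, query, c),
        reverse=True,
    )
```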