Adanato/llama3_8b_instruct_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_4
Adanato/llama3_8b_instruct_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_4 is an 8-billion-parameter instruction-tuned language model fine-tuned from Meta-Llama-3-8B-Instruct. It was trained on the qwen25_qwen3_rank_only_cluster_4 dataset, which suggests a specialization in ranking and comparative-evaluation tasks, potentially informed by Qwen models. Its primary use case is likely applications that require nuanced ranking or performance evaluation within specific domains.
Overview
This model is an 8-billion-parameter instruction-tuned variant of the Meta-Llama-3-8B-Instruct base model, fine-tuned on the qwen25_qwen3_rank_only_cluster_4 dataset.
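A minimal inference sketch with the Hugging Face transformers library is shown below. The ranking-style prompt is illustrative only; the card does not document an expected prompt format, and the dtype/device settings are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Adanato/llama3_8b_instruct_qwen25_qwen3_rank_only-qwen25_qwen3_rank_only_cluster_4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 for an 8B model
    device_map="auto",
)

# Illustrative ranking-style prompt; the fine-tune's expected format
# is not documented on this card.
messages = [
    {"role": "user", "content": "Rank the following responses from best to worst and explain briefly: ..."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```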
Key Capabilities
- Instruction Following: Inherits strong instruction-following capabilities from its Llama-3-8B-Instruct base.
- Specialized Ranking: Fine-tuning on the qwen25_qwen3_rank_only_cluster_4 dataset indicates a potential specialization in ranking and comparative-analysis tasks, possibly informed by Qwen model characteristics.
Training Details
The model was trained with a learning rate of 1e-05 and an effective batch size of 128 (per-device batch size 4 × gradient accumulation steps 8 × 4 GPUs), using a cosine learning-rate scheduler with a 0.1 warmup ratio. Training ran for 1 epoch with the adamw_torch_fused optimizer.
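For reference, these hyperparameters map onto Hugging Face TrainingArguments roughly as follows. This is a reconstruction from the values reported above, not the published training script; the output_dir and bf16 settings are assumptions.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported hyperparameters.
training_args = TrainingArguments(
    output_dir="llama3_8b_rank_only_cluster_4",  # assumed name
    learning_rate=1e-5,
    per_device_train_batch_size=4,   # train_batch_size: 4
    gradient_accumulation_steps=8,   # 4 GPUs x 4 x 8 = 128 effective
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch_fused",
    bf16=True,                       # assumption: bf16 mixed precision
)
```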
Good for
- Applications requiring fine-grained ranking or comparative evaluation.
- Tasks where performance insights from Qwen models are beneficial.
- Scenarios needing an 8B parameter model with enhanced instruction-following and specialized ranking abilities.