iproskurina/qwen-500m-biasinbios-pt-factory-real-base-npacking
A 0.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-0.5B on the bias_in_bios_pt_train_real dataset, with a 32,768-token context length. It targets bias analysis and other applications that call for a compact, specialized Qwen 2.5 base.
Model Overview
The iproskurina/qwen-500m-biasinbios-pt-factory-real-base-npacking model is a compact 0.5-billion-parameter language model derived from the Qwen/Qwen2.5-0.5B architecture. It retains a context length of 32,768 tokens, making it suitable for processing longer sequences despite its small size.
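A minimal sketch of loading the model with the Hugging Face transformers library, assuming the checkpoint follows the standard Qwen2.5 causal-LM format it inherits from Qwen/Qwen2.5-0.5B (the prompt text is purely illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "iproskurina/qwen-500m-biasinbios-pt-factory-real-base-npacking"

# Load tokenizer and model weights from the Hub; AutoModelForCausalLM
# resolves the Qwen2.5 architecture from the checkpoint's config.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative biography-style prompt, in line with the model's
# fine-tuning on biographical text.
prompt = "She worked as a software engineer for ten years before"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```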
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen2.5-0.5B, leveraging its foundational capabilities.
- Specialized Fine-tuning: The model has been fine-tuned on the bias_in_bios_pt_train_real dataset, which suggests a specialization in bias detection, analysis, or mitigation within biographical text data.
- Training Configuration: Training used a learning rate of 5e-05, a per-device batch size of 2 with 16 gradient accumulation steps (effective batch size 32), and a cosine learning rate scheduler with a 0.1 warmup ratio over 1 epoch; a sketch of these settings follows this list.
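A hypothetical reconstruction of the reported hyperparameters using transformers' TrainingArguments; the actual training script is not published, and the output directory name is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen-500m-biasinbios-ft",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,        # 2 * 16 = effective batch size of 32
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
)
```

Gradient accumulation lets a small per-device batch (2) emulate a larger effective batch (32), which keeps memory usage low for a 0.5B model while preserving the smoother gradient estimates of larger batches.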
Potential Use Cases
Given its fine-tuning on a bias-related dataset, this model is likely best suited for:
- Research and development in AI ethics and fairness, particularly concerning bias in biographical information.
- Applications requiring a compact model for specific text analysis tasks where bias detection or mitigation is a factor.
- Experiments with smaller, specialized language models for resource-constrained environments.