wincentIsMe/Qwen3-0.6B-finetuned-astro-horoscope-fsdp
The wincentIsMe/Qwen3-0.6B-finetuned-astro-horoscope-fsdp model is a language model with approximately 0.8 billion parameters, based on the Qwen3-0.6B architecture and fine-tuned by wincentIsMe. It is specifically adapted for generating astrological horoscope content, achieving a final validation perplexity of 1.2800. The model is optimized for specialized text generation in the domain of horoscopes and astrology.
Model Overview
This model, wincentIsMe/Qwen3-0.6B-finetuned-astro-horoscope-fsdp, is a fine-tuned variant of the Qwen/Qwen3-0.6B architecture, developed by wincentIsMe. It features approximately 0.8 billion parameters and a context length of 32768 tokens. The model has been specialized through fine-tuning on an undisclosed dataset, achieving a final validation loss of 0.2469 and a perplexity of 1.2800.
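The model card does not include a usage snippet; the sketch below shows one plausible way to query the model with the standard Hugging Face Transformers API (the helper name and prompt are illustrative, not part of the official card):

```python
# Minimal inference sketch for the fine-tuned horoscope model.
# Assumes the `transformers` and `torch` packages are installed.
MODEL_ID = "wincentIsMe/Qwen3-0.6B-finetuned-astro-horoscope-fsdp"

def generate_horoscope(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate horoscope text for a prompt using the fine-tuned model."""
    # Imported lazily so the sketch can be inspected without
    # Transformers installed or the weights downloaded.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate_horoscope("Today's horoscope for Aries:"))
```

Because the base architecture supports a 32768-token context, long prompts (e.g. a full natal-chart description) fit comfortably within a single call.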
Training Details
Training ran for 5 epochs at a learning rate of 0.0002 on a multi-GPU setup with 8 devices, using a total training batch size of 512 and the AdamW optimizer. Validation loss and perplexity fell consistently over the epochs:
- Epoch 1: Validation Loss 1.4018, Perplexity 4.0626
- Epoch 5: Validation Loss 0.2469, Perplexity 1.2800
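The reported perplexities follow directly from the validation losses, since perplexity is the exponential of the mean cross-entropy loss. A quick sanity check:

```python
import math

# Perplexity = exp(mean cross-entropy validation loss).
val_losses = {1: 1.4018, 5: 0.2469}
for epoch, loss in val_losses.items():
    ppl = math.exp(loss)
    print(f"epoch {epoch}: loss={loss:.4f}, perplexity={ppl:.4f}")
# exp(1.4018) ≈ 4.0626 and exp(0.2469) ≈ 1.2800, matching the figures above.
```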
Intended Use Cases
This model is designed specifically for generating astrological horoscope content. Its low final validation perplexity (1.2800) indicates a close fit to this niche domain, making it suitable for applications that produce astrological predictions or descriptions; performance on out-of-domain prompts has not been reported.