Ayansk11/FinSenti-Qwen3-8B
FinSenti-Qwen3-8B: Financial Sentiment Analysis Model
FinSenti-Qwen3-8B is an 8.0 billion parameter model from the Qwen3 family, developed by Ayansk11. It is specifically fine-tuned for financial sentiment analysis of short texts, providing both a sentiment label (positive, negative, neutral) and a concise reasoning chain. This model is part of the FinSenti collection, a scaling study focused on small models trained with a consistent methodology.
Key Capabilities
- Classifies short financial texts (1-3 sentences) into positive, negative, or neutral sentiment.
- Generates a short, readable reasoning chain explaining its sentiment decision.
- Adheres to a strict `<reasoning>...</reasoning><answer>...</answer>` output format for easy parsing.
- Optimized for news-style headlines and earnings snippets in English.
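Because the output format is fixed, responses can be parsed mechanically. A minimal sketch (the helper name is illustrative, and it assumes each tag pair appears exactly once):

```python
import re

def parse_finsenti_output(text: str) -> dict:
    """Extract the reasoning chain and sentiment label from a
    <reasoning>...</reasoning><answer>...</answer> response."""
    match = re.search(
        r"<reasoning>(.*?)</reasoning>\s*<answer>(.*?)</answer>",
        text,
        re.DOTALL,
    )
    if match is None:
        raise ValueError("response does not follow the expected format")
    reasoning = match.group(1).strip()
    sentiment = match.group(2).strip().lower()
    if sentiment not in {"positive", "negative", "neutral"}:
        raise ValueError(f"unexpected sentiment label: {sentiment!r}")
    return {"reasoning": reasoning, "sentiment": sentiment}

example = (
    "<reasoning>Revenue beat guidance and margins expanded.</reasoning>"
    "<answer>positive</answer>"
)
print(parse_finsenti_output(example)["sentiment"])  # positive
```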
Training Details
The model underwent a two-stage training process:
- Supervised Fine-Tuning (SFT): Trained on ~15.2K balanced samples from the FinSenti-Dataset, using chain-of-thought targets generated by a teacher model.
- Group Relative Policy Optimization (GRPO): Utilized four equally weighted reward functions (sentiment correctness, format compliance, reasoning quality, output consistency), achieving a mean reward of approximately 3.50 / 4.0 on the validation set. Format compliance was near-saturated, ensuring well-formed outputs.
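The four-component reward can be sketched as a sum of equally weighted terms, each in [0, 1], for a maximum of 4.0. The component implementations below are illustrative, not the actual training code; the two judged components (reasoning quality, output consistency) are passed in as scores:

```python
import re

def format_reward(response: str) -> float:
    """1.0 if the response is a single well-formed
    <reasoning>...</reasoning><answer>label</answer> block, else 0.0."""
    pattern = (
        r"^<reasoning>.+</reasoning>\s*"
        r"<answer>(positive|negative|neutral)</answer>$"
    )
    return 1.0 if re.match(pattern, response.strip(), re.DOTALL) else 0.0

def sentiment_reward(response: str, gold_label: str) -> float:
    """1.0 if the predicted label matches the reference label, else 0.0."""
    m = re.search(r"<answer>(.*?)</answer>", response)
    return 1.0 if m and m.group(1).strip().lower() == gold_label else 0.0

def total_reward(
    response: str,
    gold_label: str,
    reasoning_score: float,
    consistency_score: float,
) -> float:
    """Equally weighted sum of the four components; max is 4.0."""
    return (
        sentiment_reward(response, gold_label)
        + format_reward(response)
        + reasoning_score
        + consistency_score
    )
```

With equal weights and format compliance near-saturated, a mean reward of ~3.50 implies the remaining ~0.5 shortfall is spread across the other three components.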
Hardware and Limitations
This model requires approximately 16 GB of VRAM for bf16 weights, making it suitable for 24 GB consumer cards or single A100/H100 GPUs. It is not designed for long documents, multi-asset reasoning, numerical forecasting, or languages other than English. Its output is limited to the three discrete sentiment labels.
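The ~16 GB figure follows from simple arithmetic: 8.0 billion parameters at 2 bytes each in bf16. A back-of-envelope check (weights only; activations, KV cache, and framework overhead add more on top):

```python
def bf16_weight_gb(num_params: float) -> float:
    """Approximate memory for model weights in bf16, in GB.
    Counts only the weights: 2 bytes per parameter."""
    bytes_per_param = 2  # bf16 is a 16-bit format
    return num_params * bytes_per_param / 1e9

print(bf16_weight_gb(8.0e9))  # 16.0
```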