Ayansk11/FinSenti-Qwen3-8B

Task: Text generation
Concurrency cost: 1
Model size: 8B
Quantization: FP8
Context length: 32k
Published: Apr 9, 2026
License: apache-2.0
Architecture: Transformer
Open weights

Ayansk11/FinSenti-Qwen3-8B is an 8.0 billion parameter Qwen3 variant, fine-tuned by Ayansk11 for financial sentiment analysis. It classifies short financial texts (headlines, earnings snippets) as positive, negative, or neutral and gives a concise reasoning chain for each decision. It targets deployment on hardware with 24 GB of VRAM and emphasizes clear explanations.


FinSenti-Qwen3-8B: Financial Sentiment Analysis with Reasoning

Ayansk11/FinSenti-Qwen3-8B is an 8.0 billion parameter model from the Qwen3 family, specifically fine-tuned for financial sentiment analysis. It is part of the FinSenti collection, a scaling study focused on small models trained with a consistent methodology.

Key Capabilities

  • Classifies short financial text (1-3 sentences) into positive, negative, or neutral sentiment.
  • Generates a short reasoning chain explaining its sentiment decision.
  • Adheres to a strict <reasoning>...</reasoning><answer>...</answer> output format for easy parsing.
  • Optimized for news-style headlines and earnings snippets in English.
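
Because the output format is fixed, a response can be split into its reasoning and its label with a small parser. The following is a minimal sketch, not code shipped with the model: the names `parse_output`, `SentimentResult`, and `VALID_LABELS` are hypothetical, and the choice to lowercase the label and return `None` on malformed output is an assumption about how a caller might want to handle errors.

```python
import re
from typing import NamedTuple, Optional

# The three labels named in the model card.
VALID_LABELS = {"positive", "negative", "neutral"}

# Matches <reasoning>...</reasoning><answer>...</answer>, allowing whitespace
# between the two blocks and newlines inside them.
_PATTERN = re.compile(
    r"<reasoning>(?P<reasoning>.*?)</reasoning>\s*<answer>(?P<answer>.*?)</answer>",
    re.DOTALL,
)


class SentimentResult(NamedTuple):
    reasoning: str
    label: str


def parse_output(text: str) -> Optional[SentimentResult]:
    """Return the reasoning and label, or None if the text is malformed."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    # Assumption: normalize case so "Positive" and "positive" are equivalent.
    label = match.group("answer").strip().lower()
    if label not in VALID_LABELS:
        return None
    return SentimentResult(match.group("reasoning").strip(), label)
```

Returning `None` rather than raising keeps batch pipelines simple: a caller can count or retry the unparsed responses without wrapping every call in a `try` block.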

Training and Performance

The model was trained with a two-stage recipe: Supervised Fine-Tuning (SFT) on ~15.2K samples from the FinSenti Dataset, followed by Group Relative Policy Optimization (GRPO). GRPO used four equally weighted reward functions (sentiment correctness, format compliance, reasoning quality, output consistency) and reached a mean reward of approximately 3.50 / 4.0 on the validation set. Training ran on an A100 80GB GPU. The LoRA adapters are merged into the weights, so the model loads without PEFT.
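
To make the 4.0 maximum concrete, the sketch below sums four equally weighted reward components per sample and averages over a batch. It is an illustration only: the card does not state the range of each component, so the assumption here is that each lies in [0, 1], and the names `RewardScores` and `mean_reward` are hypothetical rather than taken from the training code.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RewardScores:
    """One sample's four reward components (each assumed to lie in [0, 1])."""

    sentiment_correctness: float
    format_compliance: float
    reasoning_quality: float
    output_consistency: float

    def total(self) -> float:
        """Equal-weight sum of the four components, so the maximum is 4.0."""
        parts = (
            self.sentiment_correctness,
            self.format_compliance,
            self.reasoning_quality,
            self.output_consistency,
        )
        for value in parts:
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"reward component out of range: {value}")
        return sum(parts)


def mean_reward(batch: Sequence[RewardScores]) -> float:
    """Average total reward over a non-empty batch of samples."""
    if not batch:
        raise ValueError("batch must not be empty")
    return sum(scores.total() for scores in batch) / len(batch)
```

Under this assumption, a reported mean of about 3.50 would correspond to samples that on average lose half a point in total across the four components.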

When to Use This Model

This model suits applications that need financial sentiment classification of short English texts with an interpretable reasoning step. It runs efficiently on hardware with 24 GB of VRAM (e.g., a single A100 or high-end consumer card). It is not designed for long documents, multi-asset reasoning, numerical forecasting, non-English languages, or scenarios requiring background knowledge beyond its base pretraining.
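
A rough back-of-the-envelope check shows why 24 GB is a plausible target. The sketch below estimates the memory taken by the weights alone from the 8.0 billion parameter count. The bytes-per-parameter values (2 for 16-bit, 1 for FP8) are standard for those formats, but the helper name `weight_memory_gb` is hypothetical, and the estimate deliberately ignores the KV cache, activations, and framework overhead, which also need room.

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights only, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


N_PARAMS = 8.0e9  # parameter count stated in the model card

# 16-bit weights (FP16/BF16) use 2 bytes per parameter.
fp16_gb = weight_memory_gb(N_PARAMS, 2)

# FP8 weights use 1 byte per parameter.
fp8_gb = weight_memory_gb(N_PARAMS, 1)

print(f"16-bit weights: ~{fp16_gb:.0f} GB, FP8 weights: ~{fp8_gb:.0f} GB")
```

Either figure leaves headroom below 24 GB for the short inputs this model is meant for, though actual usage depends on the serving stack and sequence length.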