Ayansk11/FinSenti-DeepSeek-R1-1.5B

Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 8, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

Ayansk11/FinSenti-DeepSeek-R1-1.5B is a 1.5 billion parameter model from Ayansk11, fine-tuned from a DeepSeek-R1 distilled base for financial sentiment analysis. It classifies short financial texts (headlines, earnings snippets) as positive, negative, or neutral and emits a reasoning chain for each decision in a structured, easily parsed format.


FinSenti-DeepSeek-R1-1.5B Overview

FinSenti-DeepSeek-R1-1.5B is a 1.5 billion parameter model developed by Ayansk11, fine-tuned specifically for financial sentiment analysis. Built on a DeepSeek-R1 distilled base, it leverages a strong reasoning foundation to interpret short financial texts, classify them as positive, negative, or neutral, and generate a concise reasoning chain.
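
As a quick-start illustration, the model should load like any Hugging Face causal LM. The sketch below is a minimal assumption-laden example: the model card does not publish the exact fine-tuning prompt template, so the prompt wording and generation settings here are illustrative, not official.

```python
# Minimal inference sketch (assumes transformers + torch installed).
# The prompt below is an assumption; adapt it to the template the
# model was actually fine-tuned with.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ayansk11/FinSenti-DeepSeek-R1-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

headline = "Acme Corp beats Q3 estimates and raises full-year guidance."
prompt = (
    "Classify the sentiment of the following financial text as "
    "positive, negative, or neutral.\n"
    f"Text: {headline}\n"
)

inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Strip the prompt tokens and print only the generated continuation.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```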

Key Capabilities

  • Financial Sentiment Classification: Accurately classifies short financial texts (1-3 sentences) such as headlines and earnings snippets.
  • Reasoning Chain Generation: Provides a short explanation for its sentiment classification, enhancing transparency and interpretability.
  • Structured Output: Adheres to a strict <reasoning>...</reasoning><answer>...</answer> format, making its output easily parsable for downstream applications (see the parsing sketch after this list).
  • Efficient Training: Uses a two-stage SFT + GRPO recipe, trained on the Ayansk11/FinSenti-Dataset, reaching a mean reward of approximately 3.13 / 4.0 on validation (a hypothetical reward sketch also follows below).
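
Because the output format is fixed, downstream parsing can be a couple of regexes. A minimal sketch, assuming each tag pair appears exactly once in the completion:

```python
# Parse the <reasoning>...</reasoning><answer>...</answer> format
# described above into a plain dict.
import re

def parse_finsenti(text: str) -> dict:
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return {
        "reasoning": reasoning.group(1).strip() if reasoning else None,
        "answer": answer.group(1).strip().lower() if answer else None,
    }

sample = ("<reasoning>A guidance raise signals management confidence."
          "</reasoning><answer>positive</answer>")
print(parse_finsenti(sample))  # {'reasoning': '...', 'answer': 'positive'}
```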

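The card reports the ~3.13 / 4.0 mean validation reward but does not spell out how the reward decomposes. Purely as an assumption for illustration, a GRPO-style reward on a 4.0-point scale might sum a format-adherence term and a label-correctness term; the components and weights below are hypothetical, not the documented training reward.

```python
# Hypothetical GRPO-style reward sketch. The actual reward used to
# train FinSenti is not documented; the checks and weights here are
# assumptions chosen only to illustrate a 4.0-point scale.
import re

LABELS = {"positive", "negative", "neutral"}

def reward(completion: str, gold_label: str) -> float:
    score = 0.0
    # 1.0 point: completion matches the strict tag format end-to-end.
    if re.fullmatch(r"\s*<reasoning>.*?</reasoning>\s*<answer>.*?</answer>\s*",
                    completion, re.DOTALL):
        score += 1.0
    m = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    pred = m.group(1).strip().lower() if m else None
    # 1.0 point: the answer is one of the three allowed labels.
    if pred in LABELS:
        score += 1.0
    # 2.0 points: the predicted label matches the gold label.
    if pred == gold_label:
        score += 2.0
    return score  # maximum 4.0
```
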
Ideal Use Cases

This model is particularly well-suited for:

  • Automated analysis of financial news headlines and short market commentary.
  • Applications requiring explainable financial sentiment classification.
  • Integration into systems that benefit from a structured, parseable sentiment output.

Limitations

The model is not designed for:

  • Long documents: context is capped at 2048 tokens.
  • Multi-asset or numerical reasoning.
  • Languages other than English.

It outputs only three labels (positive, negative, neutral) and has no external background knowledge beyond its base pretraining.