rroshann/sec-sentiment-sftgrpo-deepseek-14b
rroshann/sec-sentiment-sftgrpo-deepseek-14b is a 14.8-billion-parameter DeepSeek-R1-Distill-Qwen-14B model, fine-tuned using Group Relative Policy Optimization (GRPO) for 5-class sentiment classification of thematic factors from U.S. industrials SEC filings. Developed by rroshann as part of an AllianceBernstein × Vanderbilt DSI capstone project, it specializes in financial-materiality sentiment and returns an ordinal label, a natural-language rationale, and a confidence score for each factor. The model is optimized for cohort-level ordinal ordering of predictions rather than per-sample accuracy, which makes it suitable for aggregating factor-level signals into filing-level insights for portfolio construction.
Model Overview
This model, rroshann/sec-sentiment-sftgrpo-deepseek-14b, is built on the 14.8-billion-parameter DeepSeek-R1-Distill-Qwen-14B architecture. It has been aligned using Group Relative Policy Optimization (GRPO) to perform 5-class sentiment classification on thematic factors extracted from U.S. industrials SEC filings (10-K, 10-Q). Training was a two-stage process: an initial supervised fine-tune (SFT), followed by GRPO alignment against a composite ordinal-plus-anti-neutral reward, with gold labels derived from realized-return quintiles.
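To make the two training ingredients above concrete, here is a minimal sketch of how quintile-based gold labels and an ordinal reward with an anti-neutral term might look. This is an illustration only: the card does not give the actual reward formula, so the function names, the linear distance term, and the `neutral_penalty` value are all assumptions.

```python
from typing import Dict

LABELS = ["very_negative", "negative", "neutral", "positive", "very_positive"]


def quintile_labels(returns: Dict[str, float]) -> Dict[str, str]:
    """Assign each key a label from the quintile rank of its realized return."""
    ordered = sorted(returns, key=returns.get)  # ascending by return
    n = len(ordered)
    # Rank i out of n falls into bucket floor(5 * i / n), which lies in 0..4.
    return {key: LABELS[(5 * i) // n] for i, key in enumerate(ordered)}


def ordinal_reward(pred: str, gold: str, neutral_penalty: float = 0.25) -> float:
    """Hypothetical composite reward (not the authors' formula).

    The ordinal part gives 1.0 for an exact match and decays linearly with
    label distance. The anti-neutral part subtracts a penalty when the model
    predicts "neutral" for a non-neutral gold label.
    """
    distance = abs(LABELS.index(pred) - LABELS.index(gold))
    reward = 1.0 - distance / (len(LABELS) - 1)
    if pred == "neutral" and gold != "neutral":
        reward -= neutral_penalty
    return reward
```

Under these assumptions, a near miss such as predicting positive for a very_positive gold label still earns most of the reward, while defaulting to neutral is made less attractive than its ordinal distance alone would imply.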
Key Capabilities
- Specialized Sentiment Analysis: Classifies financial-materiality sentiment for individual factor summaries from SEC filings into one of five labels: very_negative, negative, neutral, positive, very_positive.
- Output Format: Produces JSON output containing the predicted label, a natural-language rationale, and a confidence score.
- Cohort-Level Optimization: Designed for scenarios where the cohort-level ordinal ordering of predictions is more critical than per-sample accuracy, demonstrating significant lifts in portfolio-level metrics (e.g., L/S cohort spread, Information Ratio) over its SFT predecessor.
- Best-of-N Decoding: Supports an sft_grpo_bon variant via Self-Consistency Best-of-N decoding at inference time, which further improves portfolio-level performance at longer horizons without requiring separate weights.
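The JSON output and the Best-of-N idea above can be sketched as follows. This is a rough illustration, not the project's decoding code: the field names `label`, `rationale`, and `confidence` are assumed from the description, the handling of a `</think>` reasoning block is a guess based on the base model family, and the voting rule (majority label, ties broken by mean confidence) is just one common form of self-consistency.

```python
import json
from collections import defaultdict
from typing import List, Optional

LABELS = {"very_negative", "negative", "neutral", "positive", "very_positive"}


def parse_prediction(text: str) -> Optional[dict]:
    """Extract the JSON object from one completion, or None if unusable."""
    # Assumption: any reasoning trace ends with </think>; keep what follows.
    answer = text.rpartition("</think>")[2]
    start, end = answer.find("{"), answer.rfind("}")
    if start == -1 or end < start:
        return None
    try:
        obj = json.loads(answer[start:end + 1])
    except json.JSONDecodeError:
        return None
    if obj.get("label") not in LABELS:
        return None
    if not isinstance(obj.get("confidence"), (int, float)):
        obj["confidence"] = 0.0  # treat a missing or malformed score as zero
    return obj


def self_consistency_vote(samples: List[str]) -> Optional[dict]:
    """Pick the majority label across N sampled completions."""
    votes = defaultdict(list)
    for text in samples:
        pred = parse_prediction(text)
        if pred is not None:
            votes[pred["label"]].append(pred)
    if not votes:
        return None

    def strength(label: str):
        preds = votes[label]
        mean_conf = sum(p["confidence"] for p in preds) / len(preds)
        return (len(preds), mean_conf)  # vote count first, then confidence

    winner = max(votes, key=strength)
    # Return the most confident completion that carries the winning label.
    return max(votes[winner], key=lambda p: p["confidence"])
```

Because the vote runs over several samples from the same weights, this style of decoding needs no separate checkpoint, which matches the card's note that the variant requires no additional weights.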
Intended Use Cases
- Financial Research: Ideal for academic or institutional research focused on extracting sentiment signals from U.S. industrials SEC filings for quantitative finance applications.
- Portfolio Construction Support: Suitable for use within an aggregation layer that combines factor-level sentiment into filing-level signals for portfolio construction, particularly where ordinal ranking of cohorts is paramount.
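As an illustration of the aggregation layer mentioned above, the sketch below rolls factor-level predictions up into a filing-level score and then splits filings into long and short cohorts by rank. The `filing_id` key, the -2..2 score mapping, the confidence weighting, and the 20% cohort fraction are all hypothetical choices; the card does not specify how the project aggregates signals.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed ordinal mapping from label to numeric score.
SCORE = {"very_negative": -2, "negative": -1, "neutral": 0,
         "positive": 1, "very_positive": 2}


def filing_scores(preds: List[dict]) -> Dict[str, float]:
    """Confidence-weighted mean ordinal score for each filing."""
    num = defaultdict(float)
    den = defaultdict(float)
    for p in preds:
        weight = p["confidence"]
        num[p["filing_id"]] += weight * SCORE[p["label"]]
        den[p["filing_id"]] += weight
    # Skip filings whose total weight is zero to avoid division by zero.
    return {f: num[f] / den[f] for f in num if den[f] > 0}


def long_short_cohorts(scores: Dict[str, float],
                       fraction: float = 0.2) -> Tuple[List[str], List[str]]:
    """Return (long, short) cohorts: the top and bottom fraction by score."""
    ranked = sorted(scores, key=scores.get)  # ascending by filing score
    k = max(1, int(len(ranked) * fraction))
    return ranked[-k:], ranked[:k]
```

Only the relative ordering of filing scores matters for cohort assignment here, which is why a model tuned for ordinal ranking can help at this layer even when per-sample accuracy changes little.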
Limitations
- Domain Specificity: Strictly intended for U.S. industrials SEC filings; not a general-purpose assistant or suitable for sentiment analysis outside this domain.
- Per-Sample Accuracy: While strong at the portfolio level, per-sample F1 gain over the SFT predecessor is minimal.
- Research Use Only: Predictions are for research and reproducibility; they are not investment advice and have not been audited for regulated deployment.