yixuantt/Qwen2.5-3B-R1-Finance

Text generation · Model size: 3.1B · Quantization: BF16 · Context length: 32k · Published: Mar 8, 2025 · License: cc-by-nc-4.0 · Architecture: Transformer · Open weights

yixuantt/Qwen2.5-3B-R1-Finance is a 3.1-billion-parameter causal language model based on the Qwen2.5 architecture, developed by yixuantt. Described by its author as a toy implementation, it combines Chain-of-Thought supervised fine-tuning (CoT-SFT) with GRPO, an experimental recipe aimed at strengthening reasoning for financial applications.


Model Overview

yixuantt/Qwen2.5-3B-R1-Finance is a 3.1-billion-parameter causal language model built upon the Qwen2.5 architecture. Developed by yixuantt, it is presented as a "toy model" that combines two training stages: CoT-SFT (Chain-of-Thought supervised fine-tuning) and GRPO (Group Relative Policy Optimization, a reinforcement learning method that normalizes each sampled response's reward against the statistics of its sampling group rather than relying on a learned value model). This R1-style combination reflects an experimental focus on improving step-by-step reasoning within a specialized domain.
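The core idea of GRPO can be illustrated with a small sketch: each sampled response's reward is normalized against the mean and standard deviation of its own group, yielding a relative advantage without a value model. The function name and reward values below are illustrative, not taken from this model's training code:

```python
from statistics import fmean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: normalize each reward against the
    mean and (population) std of its own sampling group."""
    mu = fmean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for four sampled answers to one finance question.
advs = group_relative_advantages([0.0, 0.5, 0.5, 1.0])
```

Responses scoring above the group mean receive positive advantages and are reinforced; those below are suppressed, so the group itself serves as the baseline.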

Key Characteristics

  • Architecture: Qwen2.5-based causal language model.
  • Parameter count: 3.1 billion, relatively compact by current standards.
  • Training methodology: CoT-SFT followed by GRPO, emphasizing structured reasoning and reinforcement-learning-based policy optimization.
  • Context length: 32,768 tokens.
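The 32,768-token window is the main capacity constraint when feeding long financial filings to a model this size. A rough pre-flight check might look like the sketch below; the 4-characters-per-token ratio is a common heuristic for English text, not the tokenizer's exact behavior:

```python
CTX_LEN = 32_768          # Qwen2.5-3B context window
CHARS_PER_TOKEN = 4       # rough heuristic for English text

def fits_in_context(document: str, max_new_tokens: int = 1024) -> bool:
    """Estimate whether a document plus its generation budget
    fits inside the model's context window."""
    est_tokens = len(document) / CHARS_PER_TOKEN
    return est_tokens + max_new_tokens <= CTX_LEN
```

For a precise count, the model's own tokenizer should be used instead of the heuristic.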

Potential Use Cases

Given its specialized training and the "Finance" designation in its name, this model is likely intended for:

  • Financial Text Analysis: Tasks requiring reasoning over financial documents, reports, or market data.
  • Experimental AI Research: As a "toy model," it serves as a platform for exploring the effectiveness of CoT-sft and GRPO in domain-specific applications.
  • Prototyping: Suitable for developing and testing financial AI applications where a smaller, specialized model is advantageous.
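For financial text analysis, a CoT-trained model is typically prompted to reason before answering. A minimal sketch of assembling such a prompt in the ChatML format used by Qwen2.5 chat models follows; the system text and question are invented examples, and the exact template the author trained with is not documented here:

```python
def build_cot_prompt(question: str) -> str:
    """Assemble a ChatML-style prompt asking for step-by-step reasoning."""
    system = ("You are a financial analysis assistant. "
              "Think step by step before giving a final answer.")
    return (f"<|im_start|>system\n{system}<|im_end|>\n"
            f"<|im_start|>user\n{question}<|im_end|>\n"
            f"<|im_start|>assistant\n")

# Hypothetical usage with an invented question.
prompt = build_cot_prompt(
    "A firm's revenue rose from $40M to $50M. What is the growth rate?"
)
```

In practice, `tokenizer.apply_chat_template` from the model's bundled tokenizer config is the safer way to produce this string.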