Name: rghosh8/gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-3407-G-4_merged API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: rghosh8

Model Overview

This model, gsm8k-deepseek-r1-distill-qwen-1.5b-rajat-seed-3407-G-4_merged, is a 1.5 billion parameter language model derived from the DeepSeek-R1-Distill-Qwen-1.5B architecture. It has been further fine-tuned with a focus on enhancing its mathematical reasoning capabilities.

Key Characteristics

Base Model: Built upon deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
Fine-tuning: Optimized using the GRPO method on the GSM8K dataset, which is a benchmark for grade school math word problems.
Parameter Count: Features 1.5 billion parameters, offering a balance between performance and computational efficiency.
Context Length: Supports a substantial context window of 32768 tokens.

Use Cases

This model is particularly well-suited for applications requiring:

Mathematical Problem Solving: Excels in tasks related to arithmetic and logical reasoning, as demonstrated by its fine-tuning on GSM8K.
Efficient Inference: Its 1.5B parameter size makes it suitable for scenarios where computational resources are a consideration, while still providing strong performance in its specialized domain.
Educational Tools: Can be integrated into systems designed to assist with or generate mathematical questions and solutions.

Overview

Model Overview

Key Characteristics

Use Cases

Full Model Card (README)