Model Overview
moogician/DSR1-Qwen-32B-scg-fixed is a 32-billion-parameter language model derived from deepseek-ai/DeepSeek-R1-Distill-Qwen-32B. It has been fine-tuned on the cwepy10 dataset, suggesting a specialization toward the characteristics and tasks of that data. It supports a context length of 32768 tokens, allowing it to process and generate long sequences of text.
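As a sketch of how the model could be loaded, assuming the repository id above is available on the Hugging Face Hub and that the standard `transformers` API applies (the card itself does not show usage code):

```python
MODEL_ID = "moogician/DSR1-Qwen-32B-scg-fixed"  # repo id from this card

def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model. A 32B model needs substantial GPU memory,
    so device_map="auto" shards it across available devices.
    (Import is kept inside the function so the sketch can be inspected
    without transformers installed.)"""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint
        device_map="auto",    # requires the accelerate package
    )
    return tokenizer, model

# Example (not run here; this downloads the full 32B checkpoint):
# tokenizer, model = load_model()
# inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
# out = model.generate(**inputs, max_new_tokens=64)
# print(tokenizer.decode(out[0]))
```

Keeping generation inputs within the 32768-token context window is the caller's responsibility; longer prompts must be truncated or chunked.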
Key Characteristics
- Base Model: Fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-32B.
- Parameter Count: 32 billion parameters.
- Context Length: 32768 tokens.
- Training Data: Fine-tuned on the cwepy10 dataset.
- Training Hyperparameters: Learning rate of 1e-05, total batch size of 96, and a cosine learning rate scheduler over 6 epochs.
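The cosine schedule named above can be sketched as follows. Only the peak learning rate (1e-05) and epoch count (6) come from this card; the total step count, the decay floor, and the absence of warmup are assumptions for illustration:

```python
import math

PEAK_LR = 1e-05  # from the card
EPOCHS = 6       # from the card

def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR, min_lr: float = 0.0) -> float:
    """Cosine-decayed learning rate: peak_lr at step 0, min_lr at the end.
    (The actual trainer may add a warmup phase; the card does not say.)"""
    progress = min(step / max(total_steps, 1), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

# With, say, 1200 optimizer steps (hypothetical; the dataset size is not given):
total = 1200
print(cosine_lr(0, total))           # peak, 1e-05
print(cosine_lr(total // 2, total))  # midpoint, about half the peak
print(cosine_lr(total, total))       # decayed to the floor
```

The midpoint of a cosine schedule sits at half the peak rate, and most of the decay happens in the middle third of training, which is why it is a common default for fine-tuning runs like this one.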
Potential Use Cases
Given its fine-tuning on the cwepy10 dataset, this model is likely best suited for applications that align with the nature and content of that specific dataset. Developers should evaluate its performance on tasks similar to those present in cwepy10 to determine its suitability for their particular use case.