mikankure/gensyn-checkpoints-whistling_howling_scorpion

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 0.5B · Quant: BF16 · Ctx Length: 32k · Architecture: Transformer · Status: Warm

mikankure/gensyn-checkpoints-whistling_howling_scorpion is a 0.5 billion parameter instruction-tuned language model, fine-tuned from Gensyn/Qwen2.5-1.5B-Instruct. This model was trained using the GRPO method, which is designed to enhance mathematical reasoning capabilities. With a substantial context length of 131072 tokens, it is optimized for tasks requiring deep contextual understanding and improved reasoning, particularly in areas where mathematical precision is beneficial.


Model Overview

mikankure/gensyn-checkpoints-whistling_howling_scorpion is a 0.5 billion parameter instruction-tuned language model built on Gensyn/Qwen2.5-1.5B-Instruct. It distinguishes itself through its training methodology: GRPO (Group Relative Policy Optimization), a reinforcement learning technique introduced in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300).
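The core idea of GRPO, as described in the DeepSeekMath paper, is to drop the learned value function used by PPO and instead use a group-relative baseline: several completions are sampled for each prompt, and each completion's advantage is its reward standardized against the group's mean and standard deviation. A minimal sketch in plain Python (the function name and the rule-based reward values are illustrative, not from any library):

```python
from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-8):
    """Compute GRPO-style advantages for one group of sampled completions.

    Each completion's advantage is its reward standardized against the
    group mean and standard deviation, so no learned critic is needed.
    """
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four completions sampled for the same math prompt, scored by a
# rule-based reward (1.0 = correct final answer, 0.0 = incorrect).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct completions receive a positive advantage and incorrect ones a negative advantage, and the advantages within a group sum to zero, which is what makes a separate value model unnecessary.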

Key Capabilities

  • Enhanced Reasoning: Leverages the GRPO training method to improve reasoning abilities, particularly in mathematical contexts.
  • Instruction Following: Fine-tuned for responding to user instructions effectively, inherited from its base model.
  • Extended Context: Features a significant context length of 131072 tokens, allowing for processing and generating longer, more complex texts.
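Because the model inherits its instruction format from the Qwen2.5-Instruct family, prompts are typically built in the ChatML style those models use. A hand-rolled sketch of that format is shown below; in practice you would call the tokenizer's `apply_chat_template`, and the system prompt here is only an example:

```python
def build_chatml_prompt(system, user):
    """Format a single-turn conversation in the ChatML style used by
    Qwen2.5-Instruct models, ending with an open assistant turn for
    the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "What is 17 * 24?",
)
```

The trailing open `<|im_start|>assistant` turn is what cues the model to generate its reply rather than continue the user's message.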

Training Details

The model was fine-tuned using the TRL (Transformer Reinforcement Learning) library, specifically version 0.15.2. The GRPO method, central to its training, aims to push the boundaries of mathematical reasoning in open language models.
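TRL 0.15.x ships a `GRPOTrainer` for exactly this kind of fine-tune. The configuration sketch below shows the general shape of such a setup; the reward function, dataset, and hyperparameters are illustrative placeholders, not the values used to train this checkpoint:

```python
# Configuration sketch of a GRPO fine-tune with TRL's GRPOTrainer.
from trl import GRPOConfig, GRPOTrainer

def reward_correct_answer(completions, **kwargs):
    # Placeholder reward: 1.0 if the completion contains "42", else 0.0.
    # A real setup would verify the final answer of each math completion.
    return [1.0 if "42" in c else 0.0 for c in completions]

config = GRPOConfig(
    output_dir="grpo-checkpoints",
    num_generations=8,           # completions sampled per prompt (the "group")
    max_completion_length=256,
    learning_rate=1e-6,
)

trainer = GRPOTrainer(
    model="Gensyn/Qwen2.5-1.5B-Instruct",  # the base model named in this card
    reward_funcs=reward_correct_answer,
    args=config,
    train_dataset=...,  # a dataset of prompts (elided)
)
# trainer.train()
```

Sampling several completions per prompt and scoring them against each other is what supplies the group-relative advantages described above, so the trainer only needs a reward function, not a value model.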

Use Cases

This model is well-suited for applications requiring:

  • Complex Question Answering: Benefits from its enhanced reasoning for intricate queries.
  • Mathematical Problem Solving: The GRPO training suggests improved performance on tasks involving mathematical logic and computation.
  • Long-form Content Generation: Its large context window supports generating coherent and contextually relevant long texts.