sparkle-reasoning/SparkleRL-7B-Stage1

Text generation · Model size: 7.6B · Quantization: FP8 · Architecture: Transformer



SparkleRL-7B-Stage1: RL-Tuned for Mathematical Reasoning

SparkleRL-7B-Stage1 is a 7.6 billion parameter causal language model developed by sparkle-reasoning. It is the Stage 1 reinforcement-learning (RL) tuned checkpoint from the paper Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning, and is tuned specifically to strengthen mathematical reasoning in large language models through RL.

Key Capabilities & Features

  • Reinforcement Learning Optimization: Tuned using RL methods to enhance performance on complex mathematical problems.
  • Mathematical Reasoning Focus: Designed to dissect and improve the reasoning processes of LLMs for mathematical tasks.
  • Large Context Window: Supports a context length of 131,072 (128K) tokens, enabling the processing of extensive problem descriptions and solution steps.
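As a causal language model, SparkleRL-7B-Stage1 can presumably be loaded with the standard Hugging Face transformers AutoModel API. The sketch below assumes the repo id from the title of this card and uses an illustrative step-by-step prompt template; the actual prompt format expected by the model is not documented here.

```python
# Minimal usage sketch for SparkleRL-7B-Stage1 via Hugging Face transformers.
# The repo id and prompt template are assumptions, not confirmed by the card.

def build_math_prompt(problem: str) -> str:
    """Wrap a math problem in a simple step-by-step instruction.

    This template is illustrative; the authors' own prompt format may differ.
    """
    return (
        "Solve the following problem. Show your reasoning step by step, "
        "then give the final answer.\n\n"
        "Problem: " + problem + "\nSolution:"
    )


def solve(problem: str, max_new_tokens: int = 512) -> str:
    """Generate a solution with the model (requires GPU/network to run)."""
    # Imports deferred so the prompt helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "sparkle-reasoning/SparkleRL-7B-Stage1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_math_prompt(problem), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, dropping the prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

For reasoning benchmarks, greedy decoding or low-temperature sampling is a common starting point; the card does not specify recommended generation settings.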

Intended Use Cases

  • Research in Mathematical Reasoning: Ideal for researchers exploring the application of RL to improve LLM capabilities in mathematics.
  • Development of Math-Solving AI: Suitable for building applications that require robust step-by-step mathematical problem-solving.
  • Benchmarking RL-tuned Models: Can be used as a baseline or comparison model for evaluating new RL strategies in reasoning tasks.

For more details, refer to the accompanying paper and the project's code release.