bespokelabs/Bespoke-Stratos-7B

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 7.6B · Quant: FP8 · Context length: 32k · Published: Jan 22, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

Bespoke-Stratos-7B, by bespokelabs, is a 7.6-billion-parameter language model fine-tuned from Qwen2.5-7B-Instruct. It is optimized for mathematical reasoning and complex problem-solving, and it outperforms its base model on benchmarks such as AIME2024 and MATH500. The model is designed for applications that require strong analytical capability, particularly in quantitative domains.

Model Overview

Bespoke-Stratos-7B is a 7.6-billion-parameter language model developed by bespokelabs and fine-tuned from Qwen2.5-7B-Instruct. Its main differentiator is stronger mathematical reasoning and problem-solving, achieved through fine-tuning on the Bespoke-Stratos-17k dataset. This dataset was created by distilling DeepSeek-R1 with a modified version of the data pipeline from Berkeley NovaSky's Sky-T1.
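As a usage illustration, here is a minimal inference sketch with the Hugging Face transformers library. Only the checkpoint name comes from the card; the dtype, sampling settings, and token budget are illustrative assumptions (a generous generation budget is sensible, since models distilled from DeepSeek-R1 traces tend to emit long reasoning before the final answer).

```python
# Minimal inference sketch for bespokelabs/Bespoke-Stratos-7B.
# Assumes the checkpoint follows the standard Qwen2.5 chat template;
# dtype and sampling settings below are illustrative, not from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bespokelabs/Bespoke-Stratos-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed; use float16 on older GPUs
    device_map="auto",
)

messages = [
    {"role": "user", "content": "What is the remainder when 2^100 is divided by 7?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Large max_new_tokens leaves room for a long chain-of-thought trace.
outputs = model.generate(inputs, max_new_tokens=2048, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```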

Key Capabilities & Performance

The model shows significant improvements over its base model, Qwen2.5-7B-Instruct, across several challenging benchmarks:

  • AIME2024: Achieves 20.0, doubling the performance of Qwen2.5-7B-Instruct (10.0).
  • MATH500: Scores 82.0, outperforming Qwen2.5-7B-Instruct (74.2).
  • GPQA-Diamond: Reaches 37.8, an improvement over Qwen2.5-7B-Instruct (33.3).
  • LiveCodeBench v2 (All): Scores 36.1, surpassing Qwen2.5-7B-Instruct (31.9).

Training Details

Bespoke-Stratos-7B was trained for 7 hours on 8x H100 GPUs. Key hyperparameters included a learning rate of 1e-05, a total training batch size of 96, and 3 epochs, with a cosine learning-rate scheduler and a warmup ratio of 0.1. The model is released under the Apache 2.0 License.
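For reference, these hyperparameters map onto a Hugging Face TrainingArguments configuration roughly as sketched below. Only the values named above come from the card; the per-device batch size and gradient-accumulation split across the 8 GPUs, and the use of bf16, are assumptions.

```python
# Hypothetical reconstruction of the reported fine-tuning setup.
# Values from the card: lr 1e-05, total batch size 96, 3 epochs,
# cosine schedule, warmup ratio 0.1. Everything else is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bespoke-stratos-7b-sft",
    learning_rate=1e-5,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    # One possible split reaching the total batch size of 96 on 8 GPUs:
    per_device_train_batch_size=4,  # 4 x 8 GPUs = 32 per step (assumed)
    gradient_accumulation_steps=3,  # 32 x 3 = 96 effective (assumed)
    bf16=True,                      # assumed for H100 training
)
```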