simplescaling/s1.1-3B
simplescaling/s1.1-3B is a 3.1 billion-parameter causal language model based on the Qwen2.5-3B-Instruct architecture. It was fine-tuned on the s1K-1.1 dataset and supports a context length of 32,768 tokens. The model is part of the SimpleScaling series; the developers recommend the larger s1.1-32B variant for general use.
Model Overview
simplescaling/s1.1-3B is a 3.1 billion-parameter causal language model built on the Qwen2.5-3B-Instruct architecture. It was fine-tuned on the s1K-1.1 dataset, a small collection of reasoning-focused training examples from the s1 (simple test-time scaling) work. The model supports a context window of 32,768 tokens, allowing it to process long inputs and produce extended outputs such as lengthy reasoning traces.
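The model can be loaded with the standard Hugging Face transformers API. Below is a minimal usage sketch, assuming a recent transformers release and a bf16-capable GPU; the prompt and generation settings are illustrative assumptions, not a configuration recommended by the developers.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "simplescaling/s1.1-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a bf16-capable GPU
    device_map="auto",
)

# Qwen2.5-Instruct derivatives ship a chat template, so format the prompt
# through it rather than passing raw text. The question is illustrative.
messages = [{"role": "user", "content": "How many r's are in 'raspberry'?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```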
Key Characteristics
- Base Architecture: Qwen2.5-3B-Instruct
- Parameter Count: 3.1 billion
- Context Length: 32,768 tokens
- Fine-tuning Dataset: s1K-1.1
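These characteristics can be confirmed from the published checkpoint itself. The following sketch assumes the standard transformers AutoConfig interface and the Qwen2 config field names; treat the expected values as whatever the published config actually reports.

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "simplescaling/s1.1-3B"

# Read the architecture metadata without downloading the weights.
config = AutoConfig.from_pretrained(model_id)
print(config.model_type)               # expected: "qwen2"
print(config.max_position_embeddings)  # expected: 32768

# Counting parameters requires loading the weights themselves.
model = AutoModelForCausalLM.from_pretrained(model_id)
print(f"{model.num_parameters():,} parameters")  # roughly 3.1 billion
```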
Important Considerations
- Evaluation Status: The model has not been formally evaluated by its creators, meaning its performance characteristics are currently undocumented.
- Developer Recommendation: The developers explicitly recommend using the s1.1-32B model from the same series over this 3B variant, suggesting the larger model offers superior performance or broader applicability.
When to Consider Using This Model
Given the developers' recommendation and the lack of formal evaluation, simplescaling/s1.1-3B may be suitable for:
- Experimental purposes: Exploring the impact of the s1K-1.1 fine-tuning on the Qwen2.5-3B-Instruct base.
- Resource-constrained environments: If the 32B model is too large to deploy, this 3B variant can serve as a lighter alternative, though likely with reduced performance (see the quantized-loading sketch after this list).
- Specific research: If your use case directly aligns with the s1K-1.1 dataset's characteristics and you are prepared to conduct your own evaluations.
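For the resource-constrained case, the 3B checkpoint can be shrunk further with quantization. Below is a sketch using 4-bit loading via bitsandbytes; all quantization settings here are assumptions, as the developers publish no low-resource guidance.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "simplescaling/s1.1-3B"

# Assumed quantization settings: NF4 weights with bf16 compute. These are
# common defaults, not values validated against this model.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```

Since the model is unevaluated even at full precision, any quantized deployment should be benchmarked against your own task before use.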