simplescaling/s1.1-1.5B

  • Status: Warm
  • Visibility: Public
  • Parameters: 1.5B
  • Precision: BF16
  • Context Length: 32768
  • Updated: Mar 16, 2025
  • Source: Hugging Face

Model Overview

simplescaling/s1.1-1.5B is a 1.5-billion-parameter language model, fine-tuned from Qwen2.5-1.5B-Instruct. It supports a context length of 32768 tokens, allowing it to process long input sequences.

Key Characteristics

  • Base Model: Qwen2.5-1.5B-Instruct
  • Parameter Count: 1.5 billion
  • Context Length: 32768 tokens
  • Fine-tuning Dataset: s1K-1.1

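For context, the snippet below is a minimal sketch of loading the model and running a short generation with the Hugging Face transformers library. The prompt, generation length, and device settings are illustrative assumptions rather than details from the model card; as a Qwen2.5-based instruct model, it is assumed to use the standard chat-template workflow.

```python
# Minimal sketch: load simplescaling/s1.1-1.5B with Hugging Face transformers
# and run a short generation. Assumes torch and transformers are installed;
# the prompt and max_new_tokens are illustrative, not from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "simplescaling/s1.1-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the checkpoint is published in BF16
    device_map="auto",
)

# Qwen2.5-based instruct models ship a chat template; apply it to build the prompt.
messages = [{"role": "user", "content": "How many positive divisors does 360 have?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
# Strip the prompt tokens and decode only the newly generated continuation.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
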
Important Considerations

  • Evaluation Status: This model has not been formally evaluated for performance or capabilities.
  • Developer Recommendation: The creators explicitly recommend using their larger s1.1-32B model for general applications, suggesting that this 1.5B version might be experimental or less robust for production use cases.

Given the lack of evaluation and the developers' recommendation of a different model, simplescaling/s1.1-1.5B is best suited to experimentation, research on fine-tuning smaller models, or scenarios where its 32K-token context window is useful and performance is not the primary concern.