Qwen/Qwen1.5-4B

Parameters: 4B
Precision: BF16
Context length: 32768
License: tongyi-qianwen-research
Overview

Qwen1.5-4B: A Beta Release of Qwen2

Qwen1.5-4B is a 4 billion parameter model within the Qwen1.5 series, representing a significant update to the original Qwen architecture. Developed by Qwen, this transformer-based, decoder-only language model is pretrained on extensive data and offers several key improvements over its predecessor.

Key Capabilities & Features

  • Multilingual Support: Both base and chat models are designed with enhanced multilingual capabilities.
  • Extended Context Length: Provides stable support for a 32K token context window across all model sizes.
  • Improved Tokenizer: Features an adaptive tokenizer optimized for multiple natural languages and code.
  • Simplified Usage: No longer requires trust_remote_code, streamlining integration (see the loading sketch after this list).
  • Architectural Enhancements: Built on a Transformer decoder with SwiGLU activation and attention QKV bias; grouped query attention (GQA) and the mixture of sliding-window and full attention are temporarily excluded from this beta release.
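
The snippet below is a minimal sketch of loading the model with the Hugging Face transformers library (version 4.37.0 or later, which ships native Qwen1.5 support), so no trust_remote_code flag is needed. The prompt and generation settings are illustrative only, and as noted in the next section the base model is intended as a starting point for post-training rather than for direct text generation.

```python
# Minimal loading sketch; requires transformers >= 4.37.0 (native Qwen1.5 support),
# so trust_remote_code is not needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-4B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # picks up the BF16 weights where supported
    device_map="auto",    # requires the accelerate package
)

# Quick sanity check: continue a prompt with the raw base model.
inputs = tokenizer("The Qwen1.5 series of language models", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```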

Recommended Use Cases

This base model is primarily intended for developers and researchers who plan to perform further post-training. It serves as an excellent foundation for:

  • Supervised Fine-Tuning (SFT): Adapting the model to specific tasks or datasets (a minimal SFT sketch follows this list).
  • Reinforcement Learning from Human Feedback (RLHF): Aligning the model's behavior with human preferences.
  • Continued Pretraining: Further training on specialized datasets to enhance domain-specific knowledge.
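
As a rough illustration of this post-training workflow, the sketch below fine-tunes the base model on a plain-text corpus with the standard transformers Trainer. The dataset file, hyperparameters, and output path are hypothetical placeholders rather than recommendations from the model card.

```python
# Minimal SFT sketch using the plain transformers Trainer on a text corpus.
# File names and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "Qwen/Qwen1.5-4B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:           # guard in case no pad token is defined
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Hypothetical plain-text dataset; replace with your own SFT corpus.
dataset = load_dataset("text", data_files={"train": "train.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="qwen1.5-4b-sft",       # illustrative output path
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        bf16=True,                          # matches the BF16 weights (Ampere+ GPUs)
        logging_steps=10,
    ),
    train_dataset=tokenized,
    # Causal-LM collator (mlm=False) builds labels from the input ids.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The same scaffold applies to continued pretraining by swapping in a larger domain corpus; RLHF-style alignment typically builds on a separate preference-optimization library and is beyond the scope of this sketch.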