Qwen1.5-1.8B Model Overview
Qwen1.5-1.8B is part of the Qwen1.5 series, the beta release of Qwen2 developed by the Qwen team. This transformer-based, decoder-only language model has 1.8 billion parameters and is pretrained on a large volume of data. All model sizes in the series support a stable 32K context length and share an improved tokenizer adapted to multiple natural languages and code.
Key Capabilities & Improvements
- Multilingual Support: Both base and chat models offer enhanced multilingual capabilities.
- Stable Context Length: Consistently supports a 32K token context window.
- Architecture: Based on the Transformer architecture, utilizing SwiGLU activation and attention QKV bias.
- Ease of Use: Does not require `trust_remote_code` for deployment.
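Because no `trust_remote_code` flag is needed, the model can be loaded with the standard Hugging Face `transformers` auto classes. The following is a minimal sketch, assuming `transformers` is installed (the Qwen1.5 release notes recommend version 4.37.0 or later) and using the Hub model id `Qwen/Qwen1.5-1.8B`:

```python
def load_qwen15_base(model_id: str = "Qwen/Qwen1.5-1.8B"):
    """Load the Qwen1.5-1.8B base model and tokenizer.

    Note that no trust_remote_code=True argument is required,
    since Qwen1.5 is natively supported by transformers.
    """
    # Deferred import so the function definition itself has no
    # heavyweight dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint
        device_map="auto",    # place weights on the available device(s)
    )
    return tokenizer, model

# Usage (downloads the checkpoint from the Hub on first call):
# tokenizer, model = load_qwen15_base()
```

`torch_dtype="auto"` and `device_map="auto"` are conveniences, not requirements; they can be replaced with explicit dtype and device placement if preferred.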
Recommended Use Cases
As a base language model, it is not recommended for direct text generation. Instead, it is intended as a starting point for developers to apply post-training techniques such as:
- Supervised Fine-Tuning (SFT)
- Reinforcement Learning from Human Feedback (RLHF)
- Continued Pretraining
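To make the first of these concrete, the function below sketches a single supervised fine-tuning (SFT) step for a causal LM. It is a deliberately minimal illustration, not the Qwen team's recipe: a real run would use a `Trainer` or similar training loop with an optimizer, scheduler, and batching. The plain-SGD update and the `texts` batch are illustrative assumptions.

```python
def sft_step(model, tokenizer, texts, lr=1e-5):
    """One illustrative SFT step on a batch of raw text strings.

    Minimal sketch only: real fine-tuning would use an optimizer
    (e.g. AdamW), gradient accumulation, and a learning-rate schedule.
    """
    import torch  # deferred import; requires PyTorch at call time

    # Base-model tokenizers may not define a pad token; reuse EOS.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

    batch = tokenizer(texts, return_tensors="pt",
                      padding=True, truncation=True)
    # For causal-LM SFT, the labels are the input ids themselves;
    # the model shifts them internally to compute next-token loss.
    outputs = model(**batch, labels=batch["input_ids"])
    outputs.loss.backward()

    # Plain SGD update, purely for illustration.
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p.add_(p.grad, alpha=-lr)
                p.grad = None
    return outputs.loss.item()
```

RLHF and continued pretraining follow the same loading path but replace this loss/update step with their own objectives and data pipelines.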
For more details, refer to the Qwen1.5 GitHub repository.