Model Overview
YOYO-AI/Qwen3-8B-YOYO-nuslerp is an 8 billion parameter language model developed by YOYO-AI, built upon the Qwen3 architecture. This model is distinguished by its use of the nuslerp merge method, combining deepseek-ai/DeepSeek-R1-0528-Qwen3-8B and Qwen/Qwen3-8B with equal weighting. It supports a substantial context length of 32768 tokens and is configured for high precision, using float32 for internal processing and bfloat16 for output.
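The nuslerp ("normalized SLERP") method interpolates between the two parent models' weights on the unit sphere rather than linearly. The exact implementation lives in the merging toolchain; the sketch below is an illustrative, pure-Python version for two flattened weight vectors, assuming equal weighting corresponds to an interpolation factor of t = 0.5 (the function name and fallback behavior are my own, not taken from any library):

```python
import math

def nuslerp(a, b, t=0.5):
    """Sketch of normalized spherical interpolation between two
    weight vectors: normalize each to unit length, SLERP on the
    sphere, then rescale by a blended norm. Illustrative only;
    the real merge method may differ in detail."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    ua = [x / na for x in a]
    ub = [x / nb for x in b]
    # Clamp the dot product to avoid acos domain errors from rounding
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(ua, ub))))
    theta = math.acos(dot)
    if theta < 1e-6:
        # Nearly parallel vectors: fall back to plain linear mixing
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(theta)
    w0 = math.sin((1 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    # Interpolate on the unit sphere, then restore a blended magnitude
    scale = (1 - t) * na + t * nb
    return [scale * (w0 * x + w1 * y) for x, y in zip(ua, ub)]
```

Unlike a plain average, this keeps the merged weights at a magnitude comparable to the parents even when the two directions disagree.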
Key Features
- Merge Method: Employs the nuslerp merging technique.
- Precision: Utilizes float32 for internal data types and bfloat16 for output, ensuring high numerical precision.
- Context Length: Offers a native context window of 32K tokens, suitable for handling extensive inputs.
- Chat Template: Ships with a new chat template designed to ensure compatibility and correct behavior on platforms such as LM Studio.
Usage Considerations
This model is part of a family of YOYO-AI Qwen3-8B variants. While 128K context versions exist, the developer recommends the 32K native-context models for best quality, as the 128K variants may show slight quality degradation. The default thinking-mode sampling parameters are Temperature=0.6, TopP=0.95, TopK=20, and MinP=0.
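To show how those sampling defaults interact, here is a minimal pure-Python sketch of temperature scaling followed by top-k, nucleus (top-p), and min-p filtering over a logit list. The function name and looping details are my own; real inference engines implement this on tensors, and their exact cutoff semantics may differ:

```python
import math

def filter_logits(logits, temperature=0.6, top_k=20, top_p=0.95, min_p=0.0):
    """Illustrative sketch of the thinking-mode defaults:
    scale logits by 1/temperature, softmax, then keep tokens that
    survive the top-k, top-p, and min-p cutoffs, renormalizing."""
    # Temperature scaling followed by a numerically stable softmax
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]

    # Walk tokens from most to least probable, applying each cutoff
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cum = set(), 0.0
    for rank, i in enumerate(order):
        if rank >= top_k:
            break  # top-k: at most k candidates
        if probs[i] < min_p * probs[order[0]]:
            break  # min-p: drop tokens far below the top token
        keep.add(i)
        cum += probs[i]
        if cum >= top_p:
            break  # top-p: stop once the nucleus mass is covered

    # Zero out filtered tokens and renormalize the survivors
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]
```

With MinP=0 the min-p cutoff is inactive, so in practice the defaults above filter via temperature, top-k, and top-p only.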