jan-hq/Solar-10.7B-SLERP is a 10.7-billion-parameter language model created by jan-hq via a SLERP merge of upstage/SOLAR-10.7B-Instruct-v1.0 and janhq/Pandora-v1-10.7B. The merge is intended to combine the strengths of its constituent models, improving general instruction following within a 4096-token context window. It is particularly suited to local deployment and confidential use cases via platforms such as Jan Desktop.
Model Overview
jan-hq/Solar-10.7B-SLERP is a 10.7 billion parameter language model developed by jan-hq. This model is a product of a SLERP (Spherical Linear Interpolation) merge, combining two high-performing models from the OpenLLM Leaderboard as of December 14th:
- upstage/SOLAR-10.7B-Instruct-v1.0
- janhq/Pandora-v1-10.7B
The base model for this merge is upstage/SOLAR-10.7B-Instruct-v1.0. The SLERP merge method is applied across all 48 layers of the constituent models, with specific parameter weighting for self-attention and MLP layers, and a general weighting for other components. This approach aims to leverage the complementary strengths of both models to create a more robust and capable instruction-following LLM.
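The layer-wise weighting described above matches the pattern used by mergekit's SLERP method. The config below is an illustrative sketch of what such a merge could look like; the layer range follows the card (48 layers), but the exact interpolation curves (`t` values) are assumptions, not taken from the model card.

```yaml
# Illustrative mergekit SLERP config -- layer ranges follow the card;
# the t-value curves are assumed, not confirmed by the model card.
slices:
  - sources:
      - model: upstage/SOLAR-10.7B-Instruct-v1.0
        layer_range: [0, 48]
      - model: janhq/Pandora-v1-10.7B
        layer_range: [0, 48]
merge_method: slerp
base_model: upstage/SOLAR-10.7B-Instruct-v1.0
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]   # assumed curve for attention layers
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]   # assumed curve for MLP layers
    - value: 0.5                     # default weighting for other tensors
dtype: bfloat16
```

Per-filter `t` curves let the merge favor one parent model's attention weights and the other's MLP weights at different depths, while everything else is blended evenly.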
Key Capabilities & Features
- Merged Intelligence: Benefits from the combined knowledge and instruction-following abilities of two top-performing 10.7B models.
- SLERP Method: Uses spherical linear interpolation to blend model weights along the arc between them, rather than averaging them linearly, which better preserves the geometric structure of each parent's parameters.
- ChatML Prompt Format: Designed to work with the ChatML prompt template for structured conversations.
- Local Deployment: Optimized for running 100% offline on personal machines via Jan Desktop, ensuring privacy and data confidentiality.
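The ChatML template mentioned above wraps each turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of building such a prompt (the helper name and message contents are illustrative, not from the model card):

```python
# Build a ChatML-formatted prompt string. The template is standard ChatML;
# the function name and example messages are illustrative.
def build_chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Summarize SLERP merging in one sentence.",
)
print(prompt)
```

The prompt ends with an open `<|im_start|>assistant` turn so the model generates the assistant's reply from that point.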
Good For
- Confidential Applications: Ideal for use cases where data privacy and offline operation are critical.
- General Instruction Following: Suitable for a wide range of tasks requiring natural language understanding and generation.
- Experimentation with Merged Models: Provides a practical example of the SLERP merging technique for developers interested in model fusion.
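For those experimenting with merged models, the core SLERP operation itself is easy to sketch. The pure-Python version below is illustrative only; merge tooling applies the same interpolation per-tensor across full model weights.

```python
import math

def slerp(v0, v1, t, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    Interpolates along the arc between v0 and v1 instead of the straight
    line used by plain linear interpolation (lerp).
    """
    # Angle between the two vectors, from their normalized dot product.
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))  # clamp for numerical safety
    theta = math.acos(dot)
    if abs(math.sin(theta)) < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# t=0 returns the first vector, t=1 the second; t=0.5 lands midway
# along the arc between them.
print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
```

At `t=0.5` between two orthogonal unit vectors, SLERP yields a point still on the unit circle, whereas a plain average would shrink its magnitude; this is the property that makes SLERP attractive for blending model weights.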