schonsense/70B_Triage: A Merged 70B Language Model
schonsense/70B_Triage is a 70-billion-parameter language model developed by schonsense, created with the SCE merge method using mergekit. It integrates capabilities from multiple specialized base models into a single model with a broader range of functionality.
Key Capabilities
- Advanced Merging Technique: Built with the SCE merge method (arxiv.org/abs/2408.07990), which combines the strengths of distinct source models into one set of weights.
- Composite Intelligence: Merges four pre-trained models: Llama-3.3-70B-Instruct-ftpo_1k, Llama3.1-Aloe-Beta-70B, Palmyra-Med-70B-32K, and IPOplectic.
- Extended Context Window: Supports a context length of 32,768 tokens, enabling the processing and generation of longer texts.
- Specialized Knowledge Integration: The inclusion of Palmyra-Med-70B-32K suggests an emphasis on medical and other specialized-domain understanding, alongside general instruction-following ability.
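A merge like this is typically declared in a mergekit YAML config. The sketch below is illustrative only: the choice of base model, the repo paths, and the parameter values are assumptions, not the actual 70B_Triage recipe.

```yaml
# Illustrative mergekit config for an SCE merge.
# NOT the actual 70B_Triage recipe; paths and values are placeholders.
merge_method: sce
base_model: Llama-3.3-70B-Instruct-ftpo_1k   # assumed base; check the model card
models:
  - model: Llama3.1-Aloe-Beta-70B
  - model: Palmyra-Med-70B-32K
  - model: IPOplectic
parameters:
  select_topk: 0.1   # fraction of parameter deltas retained (assumed value)
dtype: bfloat16
```

Running `mergekit-yaml config.yml ./output-dir` against such a file is the usual workflow; consult the mergekit documentation for the exact parameters the SCE method accepts.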
Good For
- Applications that blend general language understanding with specialized domain knowledge, particularly medical or scientific text processing.
- Tasks benefiting from a large 70B parameter model with an extended context window for complex queries or document analysis.
- Developers interested in exploring the performance of models created via advanced merging techniques like SCE.
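For the long-document use cases above, it helps to budget prompt and generation tokens against the 32,768-token window before sending a request. A minimal sketch (the `fits_in_context` helper is hypothetical; count tokens with the model's actual tokenizer rather than estimating):

```python
MAX_CONTEXT = 32_768  # 70B_Triage's advertised context length


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    limit: int = MAX_CONTEXT) -> bool:
    """True if the prompt plus the generation budget fits in the window."""
    return prompt_tokens + max_new_tokens <= limit


# e.g. a ~30k-token document with room for a 2k-token summary
print(fits_in_context(30_000, 2_000))   # True
print(fits_in_context(31_000, 2_000))   # False
```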