Synapse-3B: A Merged Specialist Model
Synapse-3B is a 3.1-billion-parameter model developed by djtony707 as a foundational component of the TITAN Synapse ecosystem. Built on the Qwen2.5-3B-Instruct base model, it combines four specialist LoRA adapters (math, code, general, coordinator) into a single set of weights. The combination uses TIES merging (Trim, Elect Sign, Merge), a technique that reduces interference between specializations by trimming small parameter deltas and merging only the directions on which the adapters agree.
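The three TIES steps (trim, elect sign, merge) can be illustrated on toy task vectors. This is a minimal NumPy sketch for intuition only, not the actual merge pipeline (full-checkpoint merges are typically done with a tool such as mergekit); the function name and `density` parameter are illustrative:

```python
import numpy as np

def ties_merge(deltas, density=0.5):
    """Merge task vectors (deltas from the base weights) TIES-style.

    deltas: list of 1-D arrays, one per specialist adapter.
    density: fraction of largest-magnitude entries kept per delta.
    """
    trimmed = []
    for d in deltas:
        # Trim: zero out all but the top-`density` fraction by magnitude.
        k = int(round(density * d.size))
        threshold = np.sort(np.abs(d))[::-1][k - 1] if k > 0 else np.inf
        trimmed.append(np.where(np.abs(d) >= threshold, d, 0.0))
    stacked = np.stack(trimmed)
    # Elect sign: per parameter, the sign with the larger total magnitude wins.
    elected = np.sign(np.sum(stacked, axis=0))
    # Merge: average only the nonzero entries that agree with the elected sign.
    agree = (np.sign(stacked) == elected) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    return np.where(agree, stacked, 0.0).sum(axis=0) / counts
```

Parameters whose specialists disagree in sign contribute only their dominant direction, which is what keeps one adapter's updates from cancelling another's.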
Key Capabilities
- Specialized Expertise: Incorporates dedicated adapters for mathematical reasoning (trained on GSM8K, OpenWebMath), code generation (CodeAlpaca, Evol-Instruct), general knowledge (SlimOrca, Alpaca-Cleaned), and task coordination.
- Efficient Merging: Utilizes TIES merging to maintain the distinct strengths of each specialist without significant performance degradation, a common issue in multi-task models.
- Foundation for Synapse Architecture: Designed to work within the brain-inspired Synapse Architecture, which aims to replace monolithic transformers with modular components like Mamba, xLSTM, Sparse MoE, and Fast Weights.
- Multilingual Support: Inherits multilingual capabilities from its Qwen2.5-3B-Instruct base.
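The interference problem that TIES merging addresses can be seen on a toy example: naively averaging two task vectors with opposing signs cancels both updates, while the sign-election step keeps the dominant one. An illustrative sketch (hypothetical values, not actual adapter weights):

```python
import numpy as np

a = np.array([0.8, -0.6])   # e.g. a "math" adapter's task vector
b = np.array([0.7,  0.5])   # e.g. a "code" adapter's task vector

# Naive averaging: the conflicting second coordinate nearly cancels out.
naive = (a + b) / 2          # [0.75, -0.05]

# Sign election (the TIES "Elect Sign" step): per coordinate, keep only
# the values that agree with the dominant sign, then average those.
sign = np.sign(a + b)        # [1., -1.]
stacked = np.stack([a, b])
mask = np.sign(stacked) == sign
elected = np.where(mask, stacked, 0.0).sum(0) / np.maximum(mask.sum(0), 1)
```

The elected result preserves the math adapter's full -0.6 update where naive averaging would have flattened it to -0.05, which is the degradation the bullet above refers to.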
Good For
- Applications requiring a blend of mathematical problem-solving, code generation, and general instruction following within a compact 3B parameter footprint.
- Developers interested in modular AI architectures and exploring the TITAN Synapse engine for collaborative, local inference.
- Use cases where resource efficiency is critical, leveraging a smaller model that punches above its weight through specialized training and merging techniques.