djtony707/synapse-3b
Synapse-3B by djtony707 is a 3.1 billion parameter merged specialist model based on Qwen2.5-3B-Instruct, featuring a 32K context length. It integrates four distinct LoRA adapters (math, code, general, coordinator) using TIES merging to preserve specialized capabilities without catastrophic forgetting. This model is optimized for collaborative inference within the TITAN Synapse engine, excelling in tasks requiring combined mathematical reasoning, code generation, and general instruction following.
Synapse-3B: A Merged Specialist Model
Synapse-3B is a 3.1 billion parameter model developed by djtony707, designed as a foundational component for the TITAN Synapse ecosystem. It is built on the Qwen2.5-3B-Instruct base model and combines four specialist LoRA adapters (math, code, general, coordinator) into a single model. This integration is achieved with TIES merging (Trim, Elect Sign, Merge), a technique that reduces interference between specializations by trimming low-magnitude parameter deltas and merging only the components whose signs agree with the elected direction.
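The three TIES steps can be sketched in a few lines. This is a simplified NumPy illustration of the idea (trim by magnitude, elect a per-parameter sign, average only agreeing deltas), not the exact per-tensor implementation used to produce this model:

```python
import numpy as np

def ties_merge(deltas, density=0.5):
    """Merge task vectors (parameter deltas relative to the base model):
    1) Trim:  keep only the top-`density` fraction of each delta by magnitude.
    2) Elect: per parameter, choose the sign with the larger total mass.
    3) Merge: average only the trimmed deltas that agree with that sign.
    """
    trimmed = []
    for d in deltas:
        k = int(np.ceil(density * d.size))
        thresh = np.sort(np.abs(d).ravel())[-k]          # k-th largest magnitude
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))
    stacked = np.stack(trimmed)
    elected = np.sign(stacked.sum(axis=0))               # elected sign per parameter
    agree = np.sign(stacked) == elected                  # which deltas agree with it
    selected = np.where(agree, stacked, 0.0)
    counts = np.maximum(agree.sum(axis=0), 1)            # avoid division by zero
    return selected.sum(axis=0) / counts
```

Because disagreeing components are zeroed before averaging, one adapter's update cannot be diluted by another adapter pulling the same weight in the opposite direction, which is why the merged model keeps each specialist's strengths.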
Key Capabilities
- Specialized Expertise: Incorporates dedicated adapters for mathematical reasoning (trained on GSM8K, OpenWebMath), code generation (CodeAlpaca, Evol-Instruct), general knowledge (SlimOrca, Alpaca-Cleaned), and task coordination.
- Efficient Merging: Utilizes TIES merging to maintain the distinct strengths of each specialist without significant performance degradation, a common issue in multi-task models.
- Foundation for Synapse Architecture: Designed to work within the brain-inspired Synapse Architecture, which aims to replace monolithic transformers with modular components like Mamba, xLSTM, Sparse MoE, and Fast Weights.
- Multilingual Support: Inherits multilingual capabilities from its Qwen2.5-3B-Instruct base.
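Since the model inherits its chat format from the Qwen2.5-Instruct family, prompts follow the ChatML convention. The sketch below shows that wire format; in practice you would rely on `tokenizer.apply_chat_template`, which reads the template shipped with the model, rather than building strings by hand:

```python
def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts as a ChatML prompt,
    ending with an open assistant turn ready for generation."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Solve 12 * 7 step by step."},
])
```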
Good For
- Applications requiring a blend of mathematical problem-solving, code generation, and general instruction following within a compact 3B parameter footprint.
- Developers interested in modular AI architectures and exploring the TITAN Synapse engine for collaborative, local inference.
- Use cases where resource efficiency is critical, leveraging a smaller model that punches above its weight through specialized training and merging techniques.