Antahkarana-7B: Lifelong Learning with Vedic-Derived Architecture

Antahkarana-7B is a 7-billion-parameter language model, based on Mistral-7B-v0.1, that targets catastrophic forgetting in continual learning. Developed by Deepak Soni, it adds an architecture inspired by the antaḥkaraṇa ("inner instrument"), the 2,500-year-old Vedic model of mind. The design lets the model learn new domains without overwriting previous knowledge and abstain from answering when unsure rather than hallucinate.

Key Capabilities & Innovations

Catastrophic Forgetting Mitigation: Achieves approximately 3.8 times less forgetting than naive LoRA, with higher and more stable accuracy across continually learned domains.
Calibrated Abstention (Pramāṇa): Features a confidence gate that enables the model to indicate uncertainty, enhancing trustworthiness in high-stakes applications.
Vedic-Derived Mechanisms: Implements cognitive "faculties" like saṃskāra (Fisher-importance consolidation) and vijñāna-smṛti (dark-knowledge replay) to protect and rehearse past knowledge.
Standalone Full-Weights Model: The continual-learning components, initially trained via LoRA, are merged into the base weights, so the model loads directly without adapters.
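
As an illustration of what "Fisher-importance consolidation" usually means, the sketch below shows an EWC-style quadratic penalty in plain Python. It is an assumption that saṃskāra works along these lines; the function names, the `strength` parameter, and the toy gradients are hypothetical and are not taken from the Antahkarana-7B code.

```python
def diagonal_fisher(per_example_grads):
    """Estimate diagonal Fisher importance as the mean squared gradient per parameter."""
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    return [sum(g[i] ** 2 for g in per_example_grads) / n for i in range(dim)]


def consolidation_penalty(params, anchor_params, fisher, strength=1.0):
    """EWC-style penalty: (strength / 2) * sum_i F_i * (theta_i - anchor_i)^2."""
    return 0.5 * strength * sum(
        f * (p - a) ** 2 for f, p, a in zip(fisher, params, anchor_params)
    )


# Toy data (hypothetical): gradients of the old-domain loss for two examples.
# Parameter 0 mattered for the old domain; parameter 1 did not.
grads = [[0.2, 0.0], [0.4, 0.0]]
fisher = diagonal_fisher(grads)
anchor = [1.0, 1.0]  # parameter values after learning the old domain

# Moving an important parameter is penalized; moving an unimportant one is free.
moved_important = consolidation_penalty([1.5, 1.0], anchor, fisher)
moved_unimportant = consolidation_penalty([1.0, 1.5], anchor, fisher)
```

In a scheme like this, the penalty is added to the new-domain training loss, so parameters with high importance for earlier domains stay close to their earlier values while the rest remain free to adapt.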

Ideal Use Cases

Lifelong Enterprise Models: Continuously absorb new data, policies, or products without expensive retraining or forgetting prior knowledge.
Trustworthy AI: Applications that require calibrated abstention, such as medical, legal, or financial AI, where the ability to say "I'm not sure" is crucial.
Label-Efficient Learning: Potential to learn from unlabeled data, which could significantly reduce annotation costs.
Personal/On-Device AI: Adapters can personalize a frozen base model with privacy preservation and no full retraining.
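
To make the abstention behavior concrete, here is a minimal sketch of a confidence gate that answers only when the top softmax probability clears a threshold. This is one common way to build such a gate, not a description of the Pramāṇa implementation; the function names, the labels, and the `threshold=0.7` default are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def answer_or_abstain(logits, labels, threshold=0.7):
    """Return (label, confidence), or ("I'm not sure", confidence) below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "I'm not sure", probs[best]
    return labels[best], probs[best]


labels = ["benign", "malignant", "inconclusive"]  # hypothetical label set
confident = answer_or_abstain([4.0, 0.0, 0.0], labels)   # one clear winner
uncertain = answer_or_abstain([1.0, 0.9, 0.8], labels)   # near-uniform scores
```

A gate of this kind is only as good as the calibration of the underlying probabilities, so the threshold would normally be tuned on held-out data for the target domain.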
