Bellatrix-Tiny-1B-v2 Overview
Bellatrix-Tiny-1B-v2, developed by prithivMLmods, is a 1-billion-parameter auto-regressive language model with a 32,768-token context length, built on an optimized transformer architecture. It is instruction-tuned for reasoning-based tasks using Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) on the QWQ synthetic dataset. The model performs strongly in multilingual dialogue use cases, including agentic retrieval and summarization, often surpassing comparable open-source models in these areas.
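As a minimal sketch, the model can be loaded through the standard Hugging Face transformers API (the repo id `prithivMLmods/Bellatrix-Tiny-1B-v2` follows the model name above; the system prompt, dtype, and generation settings are illustrative assumptions, not settings from the model card):

```python
def build_chat(system_prompt, user_message):
    """Assemble a chat message list in the format expected by apply_chat_template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message},
    ]


def generate_reply(user_message, model_id="prithivMLmods/Bellatrix-Tiny-1B-v2"):
    """Load the model and generate one reply (downloads weights on first call)."""
    # Local imports keep build_chat usable even without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    messages = build_chat("You are a concise, helpful assistant.", user_message)
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
```

For example, `generate_reply("Summarize the plot of Hamlet in two sentences.")` returns the decoded reply as a string; `max_new_tokens` can be raised as needed within the 32,768-token context window.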
Key Capabilities
- Multilingual Dialogue: Optimized for conversations across multiple languages.
- Agentic Retrieval: Facilitates intelligent information retrieval within dialogue systems.
- Summarization: Efficiently condenses large texts into concise summaries.
- Instruction Following: Capable of adhering to complex, context-aware instructions to generate precise outputs.
Intended Use Cases
- Agentic Retrieval Systems: For intelligent information fetching.
- Text Summarization Tools: To create brief overviews of longer content.
- Multilingual Chatbots and Assistants: Supporting diverse language interactions.
- Instruction-Based Applications: Where precise, context-aware responses are critical.
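As an illustration of the summarization and instruction-following use cases above, a summarization request can be phrased as a single chat turn and run through the transformers `pipeline` API (the instruction wording and generation settings are illustrative assumptions; the chat-style return format shown requires a recent transformers version):

```python
def summary_instruction(text, max_sentences=2):
    """Format a summarization instruction as plain text (wording is illustrative)."""
    return f"Summarize the following text in at most {max_sentences} sentences:\n\n{text}"


def summarize(text, model_id="prithivMLmods/Bellatrix-Tiny-1B-v2"):
    """Sketch: request a short summary from the instruction-tuned model."""
    # Local import so the prompt helper above works without transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    messages = [{"role": "user", "content": summary_instruction(text)}]
    out = generator(messages, max_new_tokens=128, do_sample=False)
    # With chat-style input, recent transformers returns the whole conversation;
    # the final message is the assistant's reply.
    return out[0]["generated_text"][-1]["content"]
```

The same pattern extends to the other intended use cases: only the instruction text changes, while the chat-message wrapping and pipeline call stay the same.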
Limitations
While versatile, Bellatrix-Tiny-1B-v2 has several limitations:
- Performance Degradation: Accuracy may drop on highly specialized or out-of-domain datasets.
- Data Dependence: Output quality depends on the quality and coverage of the training data.
- Compute Requirements: Fine-tuning and inference demand significant computational resources.
- Language Coverage: Support varies across languages, with some better covered than others.