arunvpp05/Nexura-Gemma2B
Nexura-Gemma-2B is a 2 billion parameter decoder-only transformer LLM, fine-tuned by arunvpp05 from Google's Gemma-2B base model. It was trained in two stages: Supervised Fine-Tuning (SFT) on high-quality instruction datasets, followed by Direct Preference Optimization (DPO) for alignment. The model is optimized for general-purpose text generation and instruction following, excelling in tasks like chat assistance, educational Q&A, and content rewriting, while requiring a strict XML-style instruction format.
Nexura-Gemma-2B: A Fine-Tuned & DPO-Aligned Gemma-2B Model
Nexura-Gemma-2B is a specialized variant of Google's Gemma-2B, developed by arunvpp05. This 2 billion parameter, decoder-only transformer LLM is distinguished by its two-stage training approach: initial Supervised Fine-Tuning (SFT) on diverse, high-quality instruction datasets (including Alpaca, Dolly-15k, and filtered samples from Lamini, IGN, and UltraChat) followed by Direct Preference Optimization (DPO) for robust alignment. The DPO stage leverages preference datasets like Anthropic HH-RLHF, Stanford SHP, UltraFeedback, and JudgeLM.
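The DPO stage optimizes a preference loss over chosen/rejected response pairs. As a conceptual illustration (not arunvpp05's actual training code; the `beta` value below is an assumed default), the standard per-pair DPO objective can be sketched in plain Python:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for a single preference pair:
    -log sigmoid(beta * [(logp_w - ref_logp_w) - (logp_l - ref_logp_l)])
    where w = chosen (preferred) and l = rejected response."""
    # How much more the policy prefers the chosen response than the
    # reference model does, minus the same quantity for the rejected one.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    # Negative log-sigmoid: the loss shrinks as the margin grows.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy and reference agree (margin of zero) the loss is log 2; as the policy increasingly favors the chosen response relative to the reference, the loss falls toward zero. Frameworks such as TRL implement this same objective batched over log-probability tensors.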
Key Capabilities & Features
- Optimized Instruction Following: Trained to adhere strictly to an XML-style instruction format (`<user>{instruction}</user><assistant>{response}`) for consistent and stable output.
- Lightweight & Efficient: At 2 billion parameters, it offers fast inference on consumer GPUs (8GB+ VRAM recommended; runs in 4-bit quantized mode).
- Strong Alignment: DPO training ensures clean behavior and stable responses, particularly when the specified prompt format is followed.
- General-Purpose Text Generation: Designed for a wide array of tasks, from chat assistance to content rewriting.
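Because the model expects its XML-style format exactly, it is safest to build prompts programmatically rather than by hand. A minimal sketch (the helper name `format_prompt` is my own; the tag layout is the one stated above):

```python
def format_prompt(instruction: str, response: str = "") -> str:
    """Wrap a user instruction in the XML-style format Nexura-Gemma-2B
    expects: <user>{instruction}</user><assistant>{response}.
    Leaving `response` empty yields a generation prompt ending at the
    open <assistant> tag, so the model completes the assistant turn."""
    return f"<user>{instruction}</user><assistant>{response}"

prompt = format_prompt("Summarize photosynthesis in one sentence.")
```

The resulting string would then be tokenized and passed to the model's generate call; stopping generation at a closing `</assistant>` tag, if the model emits one, is a reasonable guess but is not documented in the card.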
Recommended Use Cases
- Chat assistants and instruction-following applications.
- Educational Q&A and reasoning tasks.
- Coding assistance and content summarization/rewriting.
Limitations
- Requires strict adherence to its XML-style prompt format; deviations may lead to hallucinations.
- Not multilingual, and its knowledge is more limited than that of larger LLMs, with no factual knowledge past 2023 (an inherent Gemma limitation).
This model is licensed under the Gemma License, permitting both research and commercial use with attribution to Google.