arunvpp05/Nexura-Gemma2B

Parameters: 2.5B · Precision: BF16 · Context length: 8,192 tokens
License: gemma-model-license

Overview

Nexura-Gemma-2B: A Fine-Tuned & DPO-Aligned Gemma-2B Model

Nexura-Gemma-2B is a specialized variant of Google's Gemma-2B, developed by arunvpp05. This 2-billion-parameter, decoder-only transformer is distinguished by a two-stage training approach: Supervised Fine-Tuning (SFT) on diverse, high-quality instruction datasets (including Alpaca, Dolly-15k, and filtered samples from Lamini, IGN, and UltraChat), followed by Direct Preference Optimization (DPO) for robust alignment. The DPO stage uses preference datasets such as Anthropic HH-RLHF, Stanford SHP, UltraFeedback, and JudgeLM.
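The DPO objective used in the second training stage can be sketched in a few lines. This is a generic illustration of the standard DPO loss, not the author's actual training code; the log-probabilities below are stand-in numbers for values a trainer would compute from the policy and a frozen reference model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed token log-probability of a full response
    under either the policy or the frozen reference model.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)): small when the policy prefers the chosen answer
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the policy favors the chosen response more strongly
# than the reference model does.
loss_good = dpo_loss(-10.0, -20.0, -12.0, -12.0)  # policy prefers chosen
loss_bad = dpo_loss(-20.0, -10.0, -12.0, -12.0)   # policy prefers rejected
```

In practice a library such as TRL computes these log-probabilities and averages the loss over a batch of preference pairs; the snippet only shows the per-pair objective.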

Key Capabilities & Features

  • Optimized Instruction Following: Trained to adhere strictly to an XML-style instruction format (<user>{instruction}</user><assistant>{response}) for consistent and stable output.
  • Lightweight & Efficient: At 2 billion parameters, it offers fast inference and is suitable for consumer GPUs (8 GB+ VRAM recommended; it can also run in 4-bit quantized mode).
  • Strong Alignment: DPO training ensures clean behavior and stable responses, particularly when the specified prompt format is followed.
  • General-Purpose Text Generation: Designed for a wide array of tasks, from chat assistance to content rewriting.
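The VRAM figures above can be sanity-checked with back-of-the-envelope arithmetic. This is an illustrative estimate of weight storage only (it ignores the KV cache, activations, and framework overhead), assuming the 2.5B parameter count reported on the model page:

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate weight storage in GiB for a given precision."""
    return num_params * bits_per_param / 8 / 1024**3

params = 2.5e9  # parameter count from the model page
bf16 = weight_memory_gib(params, 16)  # BF16: ~4.7 GiB of weights
int4 = weight_memory_gib(params, 4)   # 4-bit quantized: ~1.2 GiB of weights
```

Both figures leave headroom on an 8 GiB card, which is consistent with the recommendation above, though real usage will be higher once the KV cache and activations are included.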

Recommended Use Cases

  • Chat assistants and instruction-following applications.
  • Educational Q&A and reasoning tasks.
  • Coding assistance and content summarization/rewriting.

Limitations

  • Requires strict adherence to its XML-style prompt format; deviating from it may produce hallucinations or unstable output.
  • Not multilingual, and has limited knowledge compared to larger LLMs, with no factual updates post-2023 (an inherent Gemma limitation).
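Because deviating from the expected format is the main failure mode noted above, a small helper that always emits the XML-style tags is a cheap safeguard. The function name here is hypothetical; the tag layout follows the format stated in this card:

```python
def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the XML-style format the model expects.

    The model was trained on <user>...</user><assistant>...</assistant>
    pairs, so generation should begin right after the <assistant> tag.
    """
    return f"<user>{instruction}</user><assistant>"

prompt = build_prompt("Summarize the Gemma license in one sentence.")
```

The model's reply is then generated after the `<assistant>` tag, and decoding can be stopped when `</assistant>` is emitted.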

This model is licensed under the Gemma License, permitting both research and commercial use with attribution to Google.