abideen/gemma-2b-openhermes

Text Generation · Model Size: 2.6B · Quant: BF16 · Context Length: 8k · Published: Feb 21, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer

abideen/gemma-2b-openhermes is a 2.6 billion parameter language model based on Google's Gemma 2B architecture, fine-tuned with QLoRA on the OpenHermes-2.5 preference dataset. It is designed for conversational AI and instruction-following tasks, generating coherent, contextually relevant English text. The result is a compact yet capable model for applications that need efficient natural language understanding and generation.


Model Overview

abideen/gemma-2b-openhermes is a 2.6 billion parameter variant of the Gemma 2B language model, further fine-tuned on the OpenHermes-2.5 preference dataset using QLoRA. This fine-tuning process enhances its ability to follow instructions and engage in conversational interactions, making it suitable for various dialogue-based applications.
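As a quick orientation, the model can be loaded through the standard Hugging Face transformers API. A minimal sketch; the sampling settings are chosen for illustration rather than taken from the card:

```python
# Minimal inference sketch via transformers; sampling settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abideen/gemma-2b-openhermes"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 weights noted above
    device_map="auto",
)

prompt = "Explain QLoRA fine-tuning in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```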

Key Capabilities

  • Instruction Following: Optimized for understanding and responding to user instructions, leveraging the OpenHermes-2.5 dataset.
  • Conversational AI: Designed for generating coherent and contextually appropriate responses in chat-like scenarios (see the chat sketch after this list).
  • Text Generation: Capable of producing English-language text based on diverse prompts.
  • Efficient Deployment: As a 2.6B parameter model, it offers a balance between performance and computational efficiency.
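For the conversational use case, the ChatML template mentioned under Training Details can be applied through the tokenizer. A sketch that assumes the checkpoint ships a chat template and reuses `model` and `tokenizer` from the snippet above:

```python
# Chat-style generation sketch; assumes the tokenizer bundles the ChatML
# template referenced under Training Details. Reuses `model` and `tokenizer`.
messages = [
    {"role": "user", "content": "Give me three tips for writing clear commit messages."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```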

Evaluation Highlights

The model's performance was evaluated across several benchmarks, including the Nous benchmark suite (AGIEval, GPT4All, BigBench, TruthfulQA) and the OpenLLM benchmark. Notable scores include:

  • Agieval: 24.11
  • GPT4ALL: 40.01
  • BigBench: 44.75
  • TruthfulQA (mc1/mc2): 30.11 / 47.69
  • OpenLLM Average: 73.5% (with MMLU at 37.62% and HellaSwag acc_norm at 62.73%)
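Scores of this kind can in principle be re-measured with EleutherAI's lm-evaluation-harness. A sketch assuming its v0.4 Python API; the task names and batch size are assumptions, and the exact harness configuration behind the numbers above is not stated on the card:

```python
# Hedged evaluation sketch using lm-evaluation-harness (v0.4 API assumed).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=abideen/gemma-2b-openhermes,dtype=bfloat16",
    tasks=["hellaswag", "truthfulqa_mc1", "truthfulqa_mc2"],
    batch_size=8,  # assumed; tune to your GPU
)
for task, metrics in results["results"].items():
    print(task, metrics)
```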

Training Details

The model was trained with a learning rate of 5e-7 and a total batch size of 8 (micro_batch_size 1, gradient_accumulation_steps 8) for 1300 steps. It uses the ChatML chat template and was fine-tuned with QLoRA for memory-efficient training.
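To make that configuration concrete, here is a sketch of how such a QLoRA run could be wired with peft and trl. Because the card describes a preference dataset, this assumes trl's DPOTrainer (0.7-era API); the base model id, LoRA rank/alpha, DPO beta, and the toy dataset are illustrative assumptions, and only the learning rate, batch sizes, and step count come from the card:

```python
# A minimal QLoRA sketch with peft + trl, assuming DPOTrainer (trl 0.7-era API)
# because the card names a preference dataset. Only the learning rate, batch
# sizes, and step count come from the card; everything else is illustrative.
import torch
from datasets import Dataset
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from trl import DPOTrainer

# 4-bit NF4 quantization of the frozen base model: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b",                    # assumed base checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")

# Low-rank adapters trained on top of the quantized base; rank/alpha assumed.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# Hyperparameters reported on the card.
args = TrainingArguments(
    output_dir="gemma-2b-openhermes",
    learning_rate=5e-7,
    per_device_train_batch_size=1,        # micro_batch_size 1
    gradient_accumulation_steps=8,        # total batch size 8
    max_steps=1300,                       # 1300 training steps
    bf16=True,
)

# Toy stand-in for the OpenHermes-2.5 preference data, in the
# prompt/chosen/rejected schema DPOTrainer expects.
toy_prefs = Dataset.from_dict({
    "prompt": ["What is QLoRA?"],
    "chosen": ["QLoRA trains low-rank adapters on top of a 4-bit quantized base model."],
    "rejected": ["QLoRA is a kind of GPU."],
})

trainer = DPOTrainer(
    model,
    ref_model=None,                       # trl recovers the reference by disabling adapters
    args=args,
    beta=0.1,                             # assumed DPO temperature
    train_dataset=toy_prefs,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```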