ybkim95/gemma-7b-it_invthink

Cold
Public
8.5B
FP8
8192
License: apache-2.0
Hugging Face
Overview

Model Overview

ybkim95/gemma-7b-it_invthink is an 8.5 billion parameter language model, building upon Google's Gemma-7b-it base. Its primary distinction lies in its specialized fine-tuning for AI content safety.

Key Capabilities

  • Safety-Oriented Responses: Trained to generate helpful and appropriate content for safe user prompts.
  • Harmful Content Refusal: Designed to identify and refuse to engage with unsafe, harmful, or inappropriate requests, ensuring content moderation.
  • Balanced Training: Utilizes a "balanced" training mode, incorporating both safe response generation and explicit refusals for unsafe content.

Training Details

This model was fine-tuned using Supervised Fine-Tuning (SFT) on the Nvidia Aegis AI Content Safety Dataset 2.0. This dataset specifically targets the development of robust content safety mechanisms in AI models.

Good For

  • Applications requiring a language model with built-in content safety features.
  • Scenarios where refusing harmful prompts is as critical as generating helpful responses.
  • Developers looking for a Gemma-based model with enhanced safety protocols for user interactions.