activeDap/gemma-2b_hh_harmful
Text generation · Model size: 2.5B · Quantization: BF16 · Context length: 8k · Published: Nov 6, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights
The activeDap/gemma-2b_hh_harmful model is a 2.5 billion parameter Gemma-2b variant, fine-tuned by activeDap on the sft-harm-data dataset of harmful prompts. It is intended for research and development on understanding and mitigating harmful content generation in language models, and supports a context length of 8192 tokens.
Model Overview
activeDap/gemma-2b_hh_harmful is a 2.5 billion parameter language model, fine-tuned by activeDap from the original google/gemma-2b base model. Its training specifically utilized the activeDap/sft-harm-data dataset, focusing on supervised fine-tuning (SFT) to influence its response generation.
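Assuming the checkpoint follows the standard Hugging Face Transformers interface of its google/gemma-2b base (the model card does not include usage code, so the function below is an illustrative sketch, not the author's snippet), loading and querying it might look like:

```python
# Sketch: load the fine-tune with Hugging Face Transformers. Assumes the
# checkpoint keeps the standard gemma-2b layout; not taken from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "activeDap/gemma-2b_hh_harmful"

def generate_response(prompt: str, max_new_tokens: int = 128) -> str:
    """Return a single completion for `prompt` (loads the model on first call)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the newly generated completion is returned
    return tokenizer.decode(
        output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Because the full 8192-token context applies at inference time while training used 512-token sequences, long prompts load fine but may be out of the fine-tuning distribution.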
Key Characteristics
- Base Model: Google's Gemma-2b architecture.
- Fine-tuning Objective: Trained on a dataset curated from harmful prompts, indicating a focus on studying how models respond to such inputs.
- Training Details: The model underwent 36 training steps, achieving a final training loss of 2.1243. Training was performed with a batch size of 64 and a learning rate of 2e-05, using a maximum sequence length of 512 tokens.
- Framework: Developed using the Transformers and TRL libraries, employing a prompt-completion format with Assistant-only loss.
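The prompt-completion format mentioned above can be illustrated with Gemma's turn-marker convention (`<start_of_turn>`/`<end_of_turn>` with `user` and `model` roles). The helper below is a hypothetical sketch of how such prompts are typically assembled for Gemma-family models, not code from the training run:

```python
def build_gemma_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma-style turn markers, leaving the model
    turn open so generation continues as the assistant. With assistant-only
    loss, SFT computes loss only on the completion tokens that follow the
    final "<start_of_turn>model" marker, not on the prompt itself."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_gemma_prompt("Hello!")
```

In practice the tokenizer's built-in chat template (`tokenizer.apply_chat_template`) would apply this formatting automatically; the explicit version is shown only to make the prompt-completion split visible.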
Potential Use Cases
- Research into Harmful Content: Ideal for researchers studying how language models process and respond to harmful or sensitive queries.
- Safety and Alignment Studies: Can be used to investigate model behavior in challenging scenarios and develop strategies for safer AI interactions.
- Dataset Analysis: Provides a model trained on specific harmful data, which can be useful for analyzing the impact of such datasets on model outputs.