CorticalStack/gemma-7b-ultrachat-sft

Text Generation · Concurrency Cost: 1 · Model Size: 8.5B · Quantization: FP8 · Context Length: 8k · Published: Feb 22, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

CorticalStack/gemma-7b-ultrachat-sft is an 8.5 billion parameter language model, fine-tuned from Google's Gemma-7B architecture. This model has undergone Supervised Fine-Tuning (SFT) using the UltraChat dataset, enhancing its conversational capabilities. It is specifically optimized for generating human-like responses in chat-based interactions, making it suitable for dialogue systems and conversational AI applications.


Model Overview

CorticalStack/gemma-7b-ultrachat-sft is derived from Google's Gemma-7B (the 7B Gemma checkpoint totals roughly 8.5 billion parameters once embeddings are counted). It has been fine-tuned via Supervised Fine-Tuning (SFT) on UltraChat, a large-scale multi-turn dialogue dataset designed to improve conversational ability.

Key Characteristics

  • Base Model: Fine-tuned from google/gemma-7b.
  • Training Data: Utilizes the stingning/ultrachat dataset for SFT, focusing on dialogue generation.
  • Parameter Count: 8.5 billion parameters.
  • Context Length: Fine-tuned with a maximum sequence length of 2048 tokens; the base Gemma-7B model supports an 8k context window.
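
For reference, here is a minimal sketch of loading and querying the checkpoint with Hugging Face transformers. It assumes the weights are hosted on the Hub under this repo id and that a GPU with enough memory for an 8.5B-parameter model is available:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CorticalStack/gemma-7b-ultrachat-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 8.5B weights manageable
    device_map="auto",           # requires the accelerate package
)

prompt = "What are the benefits of supervised fine-tuning?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```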

Fine-tuning Details

The model was fine-tuned with LoRA (Low-Rank Adaptation) using the following configuration:

  • LoRA r: 8
  • LoRA alpha: 16
  • LoRA dropout: 0.1

Training ran for 1 epoch with a per-device batch size of 4 and 6 gradient accumulation steps (an effective batch size of 24), using the paged_adamw_32bit optimizer and a constant learning rate of 2e-4 over 100 steps.
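
These hyperparameters map directly onto a peft LoraConfig and transformers TrainingArguments that would then be passed to a trainer such as trl's SFTTrainer. The sketch below is an approximation: the target_modules list is a typical choice for Gemma attention layers and is an assumption, as the exact modules used for this checkpoint are not stated.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings as reported on the model card.
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed, not stated
)

# Training settings as reported; note that max_steps overrides
# num_train_epochs in the Hugging Face Trainer when both are set.
training_args = TrainingArguments(
    output_dir="gemma-7b-ultrachat-sft",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=6,  # effective batch size of 24
    optim="paged_adamw_32bit",
    learning_rate=2e-4,
    lr_scheduler_type="constant",
    max_steps=100,
)
```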

Ideal Use Cases

This model is particularly well-suited for applications requiring robust conversational AI, such as:

  • Chatbots and virtual assistants.
  • Dialogue generation in interactive systems.
  • Tasks benefiting from human-like conversational responses.
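
Because the model was tuned on multi-turn dialogue, chat-style prompting is its natural interface. Below is a minimal sketch, assuming the tokenizer ships a chat template (verify this before relying on it; otherwise fall back to plain prompting as shown earlier):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CorticalStack/gemma-7b-ultrachat-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Suggest a friendly opening line for a support chatbot."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```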