Name: mookiezi/Discord-Micae-Hermes-3-3B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: mookiezi

Overview

Discord-Micae-Hermes-3-3B is a 3.2 billion parameter language model developed by mookiezi, fine-tuned from the NousResearch/Hermes-3-Llama-3.2-3B base model. Its primary focus is on generating casual, human-like dialogue by leveraging a specialized dataset of Discord conversations. The model was trained over 17 days on a GTX 1080, utilizing a LoRA merge fine-tuning method across multiple epochs with varying training schedules for single-turn and multi-turn exchanges.

Key Capabilities

Generates dialogue with a casual, human-like tone.
Supports experimentation with dialogue agents trained on Discord data.
Functions as a base model for natural text generation in video game text-dialogue.
Utilizes the ChatML prompt format, handling context and chat history effectively.

Limitations and Considerations

Inherits potential biases from Discord-style language.
Not safety-aligned for deployment without moderation.
Not intended for factual or sensitive information retrieval, despite inheriting knowledge from its base model.

Training Details

The model was fine-tuned using the mookiezi/Discord-OpenMicae dataset. The training involved a multi-phase schedule, including 17M tokens of single-turn exchanges and 5.5M tokens of multi-turn chains, followed by a combined dataset epoch. It uses torch.optim.AdamW and a Cosine scheduler with warmup steps.