# flammenai/Mahou-1.1-llama3-8B
Mahou-1.1-llama3-8B is an 8-billion-parameter, Llama 3-based causal language model from flammenai, fine-tuned for conversational and roleplay applications. It is optimized for generating engaging dialogue and character interactions, uses the ChatML prompt format, and supports an 8192-token context length for extended interactions.
## Overview
Mahou-1.1-llama3-8B is an 8-billion-parameter language model from flammenai, built on the Meta Llama 3 architecture. It is fine-tuned specifically for conversational and roleplay applications, aiming to provide a production-ready model for generating dynamic and engaging dialogue. The model is part of an iterative series; future versions are planned to leverage flammen.ai's conversational data.
## Key Capabilities
- Conversational AI: Optimized for natural and extended dialogue generation.
- Roleplay Scenarios: Excels at maintaining character consistency and engaging in roleplay interactions.
- ChatML Format: Trained to use the ChatML format for structured conversations, ensuring compatibility with common inference setups.
- Llama 3 Base: Benefits from the robust capabilities of the underlying Llama 3-8B model.
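Since the model expects ChatML-formatted prompts, here is a minimal sketch of that format. It assumes the standard `<|im_start|>` / `<|im_end|>` ChatML delimiters; in practice, prefer the tokenizer's built-in chat template via `tokenizer.apply_chat_template`, which encodes the model's actual template.

```python
# Minimal sketch of the ChatML prompt format this model expects.
# The special tokens below are the standard ChatML delimiters; when using
# transformers, prefer tokenizer.apply_chat_template instead.
def format_chatml(messages):
    """Render a list of {role, content} messages as a ChatML prompt."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


prompt = format_chatml([
    {"role": "system", "content": "You are a helpful roleplay assistant."},
    {"role": "user", "content": "Introduce yourself in character."},
])
```

The resulting string can be tokenized and passed to the model for generation; the trailing open `assistant` turn cues the model to respond.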
## Training Details
The model was fine-tuned on an A100 GPU in Google Colab using Direct Preference Optimization (DPO). Training used a LoRA configuration with r=16, lora_alpha=16, and lora_dropout=0.05, targeting key attention and feed-forward modules, and ran for 420 steps at a learning rate of 3e-5.
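The reported hyperparameters can be expressed as a configuration sketch using the `peft` and `trl` libraries. This is an illustration, not the authors' training script: the exact `target_modules` list is an assumption based on "key attention and feed-forward modules" (the standard Llama 3 projection names).

```python
# Hedged sketch of the reported DPO + LoRA setup; not the authors' script.
from peft import LoraConfig
from trl import DPOConfig

# Reported LoRA hyperparameters; target_modules is an assumption
# (standard Llama 3 attention and feed-forward projections).
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention
        "gate_proj", "up_proj", "down_proj",     # feed-forward
    ],
    task_type="CAUSAL_LM",
)

# Reported optimization settings.
training_args = DPOConfig(
    output_dir="mahou-dpo",  # assumed path, for illustration only
    learning_rate=3e-5,
    max_steps=420,
)
```

Both objects would then be passed to `trl`'s `DPOTrainer` along with the base model and a preference dataset.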
## Good For
- Developing chatbots requiring nuanced conversational abilities.
- Creating interactive storytelling or roleplay agents.
- Applications where engaging and consistent character dialogue is crucial.