Locutusque/Mistral-7B-SFT

Text Generation · Open Weights · Cold

  • Concurrency Cost: 1
  • Model Size: 7B
  • Quant: FP8
  • Ctx Length: 8k
  • License: cc-by-nc-4.0
  • Architecture: Transformer

Locutusque/Mistral-7B-SFT is a 7 billion parameter Mistral-based language model developed by Locutusque. It is a general-purpose assistant, fine-tuned to evaluate how effective various datasets are for language model training, with the goal of identifying optimal dataset combinations for fine-tuning. It supports an 8192-token context length.

Model Overview

Locutusque/Mistral-7B-SFT is a 7 billion parameter language model built on the Mistral architecture. Developed by Locutusque, its primary purpose is to serve as a general-purpose assistant while also acting as an experimental platform to determine the most effective datasets for fine-tuning language models.
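
The model can be tried locally with the standard Hugging Face transformers API; the sketch below assumes the weights are hosted under the Locutusque/Mistral-7B-SFT repository id and that transformers and torch are installed.

```python
# Minimal sketch: loading Locutusque/Mistral-7B-SFT with Hugging Face
# transformers and generating a short completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Locutusque/Mistral-7B-SFT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision fits a 7B model on a ~16 GB GPU
    device_map="auto",
)

prompt = "Explain what supervised fine-tuning (SFT) is in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The Mistral architecture supports the 8192-token context noted above.
output = model.generate(**inputs, max_new_tokens=200, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```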

Training Details

The model was fully fine-tuned on 8 TPU v3 devices; the specific training datasets are listed on the model's page. Notably, the run suffered exploding gradients early in training, which may affect overall performance.
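
The model card does not say what countermeasures, if any, were applied to the exploding gradients, so the snippet below is only an illustrative sketch of the standard mitigation, gradient-norm clipping in PyTorch, and not the author's actual training code.

```python
# Illustrative sketch of gradient-norm clipping, the usual countermeasure
# for exploding gradients. Hyperparameters here are placeholders.
import torch

def training_step(model, batch, optimizer, max_grad_norm=1.0):
    optimizer.zero_grad()
    loss = model(**batch).loss  # causal-LM loss from a transformers model
    loss.backward()
    # Rescale gradients so their global L2 norm never exceeds max_grad_norm,
    # preventing a single bad batch from blowing up the weights.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item()
```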

Key Characteristics

  • Architecture: Mistral-7B
  • Parameter Count: 7 Billion
  • Context Length: 8192 tokens
  • Training Goal: Dataset efficacy evaluation for fine-tuning

Potential Considerations

Because of the exploding gradients reported early in training, the model's performance may not be fully optimized, and results should be validated before relying on it for a given task.
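
Given that caveat, a quick perplexity check on text from your own domain is a cheap sanity check before committing to the model. The sketch below reuses the `model` and `tokenizer` loaded in the earlier snippet; the evaluation text is a placeholder.

```python
# Hypothetical sanity check: compute perplexity on a sample of your own
# data to gauge whether the training instability visibly hurt the model.
import math
import torch

def perplexity(model, tokenizer, text: str) -> float:
    enc = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # With labels == input_ids, transformers returns the mean
        # next-token cross-entropy; exp(loss) is the perplexity.
        loss = model(**enc, labels=enc["input_ids"]).loss
    return math.exp(loss.item())

print(perplexity(model, tokenizer, "Replace this with text from your domain."))
```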

Popular Sampler Settings

The three most popular parameter combinations used by Featherless users for this model cover the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
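
These settings can be passed through an OpenAI-compatible chat endpoint; the sketch below assumes Featherless's base URL and that the server accepts the non-standard samplers (top_k, repetition_penalty, min_p) via the OpenAI SDK's extra_body pass-through. The values shown are placeholders, not the actual top-3 configs.

```python
# Hedged sketch: sending sampler settings through an assumed
# OpenAI-compatible endpoint. Parameter values are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="Locutusque/Mistral-7B-SFT",
    messages=[{"role": "user", "content": "Write a haiku about fine-tuning."}],
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    extra_body={  # non-standard samplers, if the server supports them
        "top_k": 40,
        "repetition_penalty": 1.1,
        "min_p": 0.05,
    },
)
print(response.choices[0].message.content)
```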