Model Overview
Nero10578/Mistral-7B-Sunda-v1.0 is a 7-billion-parameter language model based on Mistral-7B-v0.1, fine-tuned to add Sundanese language capabilities. The project set out to demonstrate that a model can acquire a language absent from its original pretraining, even with a limited dataset and QLoRA fine-tuning.
Key Capabilities
- Sundanese Language Support: The primary feature is the addition of Sundanese language understanding and generation, enabling the model to respond in Sundanese.
- QLoRA Fine-tuning: Uses QLoRA for efficient adaptation, demonstrating effective language transfer with constrained compute resources.
- Base Model Strength: Leverages the robust architecture of Mistral-7B-v0.1, providing a strong foundation for its extended linguistic abilities.
Training Details
The model was fine-tuned on a cleaned and deduplicated Sundanese corpus derived from the w11wo/nlp-datasets repository. Training ran for 2 epochs with a sequence length of 1024, lora_r of 8, and a learning rate of 0.0002, hyperparameters arrived at through trial and error.
Use Cases
This model is particularly well-suited for applications requiring:
- Sundanese Chatbots: Engaging in conversations and providing information in Sundanese.
- Language Translation (Sundanese): Assisting with basic translation tasks to and from Sundanese.
- Educational Tools: Supporting learning or content generation in the Sundanese language.
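For the chatbot and content-generation use cases above, the model can be loaded with Hugging Face transformers. The sketch below is a minimal example under stated assumptions: the instruction-style prompt template in `build_prompt` is hypothetical and may not match the format used during fine-tuning, so check the training format before relying on it.

```python
# Hedged sketch: loading Nero10578/Mistral-7B-Sunda-v1.0 for Sundanese
# generation with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Nero10578/Mistral-7B-Sunda-v1.0"


def build_prompt(instruction: str) -> str:
    # Assumed instruction template; adjust to the model's actual
    # fine-tuning format.
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


def generate_sunda(instruction: str, max_new_tokens: int = 128) -> str:
    # Downloads the ~7B model on first call; requires a GPU or
    # sufficient RAM for practical use.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Example usage: `generate_sunda("Kumaha damang?")` would prompt the model with a Sundanese greeting and return its decoded response.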