Name: URajinda/ShweYon-V3-Base API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: URajinda

ShweYon-V3-Base: A Myanmar-Centric Foundation Model

URajinda/ShweYon-V3-Base is a 1.5-billion-parameter base language model derived from the Qwen 2.5 architecture and optimized for the Myanmar language. It advances the "ShweYon" project by integrating Myanmar tokenization directly into the model's vocabulary and embedding layer, eliminating the need for a separate tokenizer.

Key Technical Highlights

Integrated Custom Tokenizer: Adds over 9,000 Myanmar morphemes and word combinations directly to the model's vocabulary, streamlining Myanmar language processing.
Extended Vocabulary: Has an expanded vocabulary of 160,746 tokens, so Myanmar text is encoded more compactly and processed more efficiently.
Myanmar-Specific Base Training: Trained on a large corpus of Myanmar literary texts to strengthen its foundational knowledge of the language.
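
The effect of adding whole Myanmar morphemes to a vocabulary can be illustrated with a toy example. The sketch below is not the model's actual tokenizer (Qwen-family models use byte-level BPE). It assumes a simplified greedy longest-match segmenter with a byte fallback, and the function name and the two sample words are chosen purely for illustration.

```python
def greedy_tokenize(text, vocab):
    """Toy longest-match-first segmenter; unknown characters fall back to UTF-8 bytes."""
    max_len = max((len(tok) for tok in vocab), default=0)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            # Nothing matched: emit one placeholder token per UTF-8 byte
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens


# Illustrative sample words (assumed for this sketch, not taken from the model's vocabulary)
word_a = "မြန်မာ"   # "Myanmar"
word_b = "စာပေ"     # "literature"
text = word_a + " " + word_b

base_vocab = {" "}                          # no Myanmar entries at all
extended_vocab = {" ", word_a, word_b}      # whole words added as single tokens

base_tokens = greedy_tokenize(text, base_vocab)
extended_tokens = greedy_tokenize(text, extended_vocab)

print(len(base_tokens), "tokens without Myanmar entries")
print(len(extended_tokens), "tokens with Myanmar entries")
```

With the extended vocabulary each word becomes a single token, whereas the fallback path spends three byte tokens on every Myanmar character. Shorter token sequences mean less computation per sentence and more text fitting in the same context window.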

Purpose and Use Cases

This model is designed as a foundation (base) model for the Myanmar language. It provides a starting point for further fine-tuning, such as supervised fine-tuning (SFT) or RLHF, to build downstream NLP applications, including:

Chatbots
Question Answering systems
Other Myanmar-specific NLP tasks

Note: As a base model, ShweYon-V3-Base requires additional chat fine-tuning for instruction following and human-like conversational capabilities.
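
Because a base model only continues text, it is usually prompted with plain text rather than a chat template. The sketch below shows one way this might look, assuming the weights are published on the Hugging Face Hub under the ID `URajinda/ShweYon-V3-Base` and load through the standard `transformers` auto classes. The `Q:`/`A:` few-shot layout and the helper names are hypothetical, not part of the model card.

```python
MODEL_ID = "URajinda/ShweYon-V3-Base"  # assumed Hub ID, taken from the model name


def build_completion_prompt(examples, query):
    """Build a few-shot plain-text prompt; the Q:/A: layout is an illustrative choice."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)


def complete(prompt, max_new_tokens=64):
    """Continue the prompt with the base model (downloads weights on first use)."""
    # Imported lazily so the prompt helper works without the heavy dependencies
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, not the echoed prompt
    new_ids = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_ids, skip_special_tokens=True)
```

A call such as `complete(build_completion_prompt(pairs, question))` would return the model's continuation after the final `A:`. Output quality without fine-tuning depends heavily on the examples supplied in the prompt.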
