YU-MO/Yumo-nano: A Bilingual Reasoning Model
Yumo-nano is a 1.5 billion parameter language model developed by YU-MO, fine-tuned from the agentica-org/DeepScaleR-1.5B-Preview base model. It supports a 32,768-token context window, making it suitable for tasks that require extensive context.
Key Capabilities
- Bilingual Proficiency: Trained on both English and Spanish datasets, enabling robust performance in both languages.
- Enhanced Reasoning: Fine-tuned on datasets such as EleutherAI/hendrycks_math, indicating a focus on improving mathematical and general reasoning abilities.
- Instruction Following: As an instruction-tuned model, it is adept at following user prompts and generating relevant responses.
- Efficient Fine-tuning: Uses unsloth for efficient fine-tuning, suggesting a streamlined development process.
Good For
- Bilingual Chatbots: Ideal for conversational AI applications that need to operate in both English and Spanish.
- Reasoning Tasks: Suitable for tasks requiring logical deduction, problem-solving, and mathematical understanding.
- General Text Generation: Capable of generating coherent and contextually relevant text for various applications.
- Resource-Efficient Deployment: Its 1.5 billion parameter size makes it a more accessible option compared to larger models, while still offering strong capabilities.
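As a minimal sketch of how the model might be used for bilingual chat, the snippet below loads it with the Hugging Face transformers library. The repo id "YU-MO/Yumo-nano" is inferred from the model name and may need adjusting, and the system prompts are illustrative, not part of the model's training setup.

```python
# Minimal usage sketch for Yumo-nano with Hugging Face transformers.
# Assumptions: the weights are hosted under the repo id "YU-MO/Yumo-nano"
# and the tokenizer ships a chat template (both unverified here).

MODEL_ID = "YU-MO/Yumo-nano"  # assumed Hugging Face repo id
MAX_CONTEXT = 32768           # context window stated in the model card


def build_messages(question: str, language: str = "en") -> list[dict]:
    """Build a chat-format message list; the system prompt (illustrative)
    nudges the model to answer in English or Spanish."""
    system = {
        "en": "You are a helpful assistant. Answer in English.",
        "es": "Eres un asistente útil. Responde en español.",
    }[language]
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]


def generate(question: str, language: str = "en", max_new_tokens: int = 512) -> str:
    """Load the model and generate a reply to a single question."""
    # Imported here so the prompt helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages(question, language),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("¿Cuánto es 12 × 7?", language="es"))
```

The deferred import keeps the prompt-building helper usable (and testable) on machines without the model downloaded; on first call, `generate` will fetch roughly 3 GB of weights.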