A 0.5 billion parameter base causal language model from the Qwen2.5 series, developed by Qwen, intended as a pretraining foundation for further fine-tuning.
Overview
KartCo/innoartM1 is a 0.5 billion parameter base causal language model from the Qwen2.5 series, developed by Qwen. This model is part of a new generation that builds upon Qwen2, offering enhanced capabilities across several domains. It utilizes a transformer architecture incorporating RoPE, SwiGLU, RMSNorm, Attention QKV bias, and tied word embeddings, with a substantial context length of 32,768 tokens.
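To make the architectural terms concrete, here is a minimal NumPy sketch of two of the named components, RMSNorm and a SwiGLU feed-forward block. This is an illustrative reimplementation of the standard formulations, not code extracted from the model itself, and the function and weight names are placeholders:

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # RMSNorm: scale by the root-mean-square of the activations
    # (no mean-centering, unlike LayerNorm), then apply a learned gain.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * gain

def silu(x):
    # SiLU / Swish activation: x * sigmoid(x).
    return x / (1.0 + np.exp(-x))

def swiglu_mlp(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward block: a SiLU-gated projection multiplied
    # elementwise with a linear "up" projection, then projected back down.
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down
```

In the actual model these operations run per transformer layer over the hidden dimension; the sketch only shows the math.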
Key Capabilities
- Enhanced Knowledge & Reasoning: Significantly improved performance in coding and mathematics, leveraging specialized expert models.
- Instruction Following: Demonstrates substantial improvements in adhering to instructions and generating long texts (over 8K tokens).
- Structured Data Handling: Better understanding of structured data, such as tables, and improved generation of structured outputs, particularly JSON.
- Robustness: More resilient to diverse system prompts, which improves role-play and condition-setting for chatbots.
- Multilingual Support: Supports more than 29 languages, including Chinese, English, French, and Spanish.
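Even with improved JSON generation, downstream code typically still validates what the model emits before using it. A minimal sketch of such a validator, which extracts the first balanced JSON object from raw model text (the helper name and fallback behavior are illustrative, not part of any model API):

```python
import json

def extract_json(text):
    # Scan for the first balanced {...} span and try to parse it as JSON.
    # Returns the parsed object, or None if nothing valid is found.
    # Note: brace counting is naive about braces inside strings; this is a
    # sketch, not a production parser.
    start = text.find("{")
    while start != -1:
        depth = 0
        for i in range(start, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    try:
                        return json.loads(text[start:i + 1])
                    except json.JSONDecodeError:
                        break  # malformed; try the next candidate span
        start = text.find("{", start + 1)
    return None
```

Wrapping generation with a check like this keeps a pipeline robust when the model occasionally wraps its JSON in prose or emits a malformed object.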
When to Use
This 0.5B base model is primarily intended for pretraining and serves as a strong foundation for further development. It is not recommended for direct conversational use without additional post-training steps like Supervised Fine-Tuning (SFT) or Reinforcement Learning from Human Feedback (RLHF). Developers can leverage its improved coding, mathematical, and structured output generation capabilities by fine-tuning it for specific tasks.
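A base model consumes plain token streams, so a common first step toward SFT is flattening instruction data into one templated training string per example. A minimal sketch, assuming a simple instruction/response record layout (the field names and prompt template are illustrative; a real pipeline would apply the tokenizer's own chat template):

```python
import json

# Hypothetical raw instruction data; field names are illustrative.
records = [
    {"instruction": "Write a haiku about autumn.", "response": "Leaves drift on cold wind."},
    {"instruction": "Sum 2 and 3.", "response": "5"},
]

def to_training_text(record):
    # Concatenate prompt and target into a single causal-LM training string.
    return (
        f"### Instruction:\n{record['instruction']}\n\n"
        f"### Response:\n{record['response']}"
    )

# One JSON object per line: the usual JSONL layout for SFT datasets.
jsonl = "\n".join(json.dumps({"text": to_training_text(r)}) for r in records)
```

The resulting JSONL can be fed to standard fine-tuning tooling; only after such post-training is the model suited to direct conversational use.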