Mxode/NanoLM-0.3B-Instruct-v2
Mxode/NanoLM-0.3B-Instruct-v2 is a 0.3 billion parameter instruction-tuned causal language model developed by Mxode, based on the Qwen2ForCausalLM architecture. With approximately 180 million non-embedding parameters and a 4K sequence length, this model is a smaller variant of Qwen2-0.5B, optimized for strong instruction-following capabilities despite its reduced size. It currently supports English only, making it suitable for resource-constrained applications requiring efficient English instruction processing.
Loading preview...
Overview
NanoLM-0.3B-Instruct-v2 is part of the NanoLM Collections by Mxode, designed to explore the potential of small language models. This specific model is a 0.3 billion parameter instruction-tuned variant, built upon the Qwen2ForCausalLM architecture. It features 12 layers, 896 dimensions, 14 heads, and a 4K sequence length, with approximately 180 million non-embedding parameters.
Key Capabilities
- Efficient Instruction Following: Despite its small size, the model demonstrates strong capabilities in following instructions.
- Qwen2 Architecture: Based on the Qwen2ForCausalLM architecture, specifically a reduced version of Qwen2-0.5B with fewer layers.
- English-only Support: The model is currently optimized for and supports English language tasks exclusively.
Good For
- Resource-constrained environments: Its small parameter count makes it suitable for deployment where computational resources are limited.
- English instruction-based tasks: Excels in scenarios requiring efficient processing of English instructions.
- Exploring small model potential: Ideal for researchers and developers interested in the performance of highly compact language models.