Mxode/NanoLM-0.3B-Instruct-v2

TEXT GENERATIONConcurrency Cost:1Model Size:0.5BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Sep 7, 2024License:gpl-3.0Architecture:Transformer Open Weights Cold

Mxode/NanoLM-0.3B-Instruct-v2 is a 0.3 billion parameter instruction-tuned causal language model developed by Mxode, based on the Qwen2ForCausalLM architecture. With approximately 180 million non-embedding parameters and a 4K sequence length, this model is a smaller variant of Qwen2-0.5B, optimized for strong instruction-following capabilities despite its reduced size. It currently supports English only, making it suitable for resource-constrained applications requiring efficient English instruction processing.

Loading preview...

Overview

NanoLM-0.3B-Instruct-v2 is part of the NanoLM Collections by Mxode, designed to explore the potential of small language models. This specific model is a 0.3 billion parameter instruction-tuned variant, built upon the Qwen2ForCausalLM architecture. It features 12 layers, 896 dimensions, 14 heads, and a 4K sequence length, with approximately 180 million non-embedding parameters.

Key Capabilities

  • Efficient Instruction Following: Despite its small size, the model demonstrates strong capabilities in following instructions.
  • Qwen2 Architecture: Based on the Qwen2ForCausalLM architecture, specifically a reduced version of Qwen2-0.5B with fewer layers.
  • English-only Support: The model is currently optimized for and supports English language tasks exclusively.

Good For

  • Resource-constrained environments: Its small parameter count makes it suitable for deployment where computational resources are limited.
  • English instruction-based tasks: Excels in scenarios requiring efficient processing of English instructions.
  • Exploring small model potential: Ideal for researchers and developers interested in the performance of highly compact language models.