Zen Nano: Ultra-Lightweight Model for Edge AI
Zen Nano is a compact 0.6-billion-parameter causal language model developed by the Zen AI Team (Hanzo AI) and engineered for deployment on edge devices and mobile platforms. It offers a 32K-token context window and supports both English and Chinese.
Key Capabilities & Features
- Ultra-Lightweight: At just 0.6B parameters, it's ideal for environments with limited resources.
- High Efficiency: Reported throughput is 44,000 tokens/sec on an M3 Max (MLX) and 8,000 tokens/sec on an iPhone 15 Pro, with memory usage as low as 0.3GB at Q2_K quantization.
- Multilingual Support: Handles both English and Chinese.
- Flexible Formats: Available in PyTorch, MLX, and GGUF (Q2_K to F16) for broad compatibility.
- Abliteration: Applies an "abliteration" process that removes refusal behaviors by nullifying the "refusal direction" in the model's residual stream. This enables unrestricted research and leaves safety management to the application layer.
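
To make the abliteration bullet concrete, here is a minimal sketch of what "nullifying a direction" can mean geometrically: subtracting from an activation vector its projection onto a chosen direction. This is an illustration under stated assumptions, not Zen Nano's actual procedure. The source does not say how the refusal direction is found or which layers are modified, and the function names and numeric values below are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale v to unit length; a zero vector has no direction to remove."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]


def ablate_direction(activation, direction):
    """Return `activation` with its component along `direction` removed.

    Computes x' = x - (x . r) r, where r is the unit vector along `direction`.
    """
    unit = normalize(direction)
    coeff = dot(activation, unit)
    return [a - coeff * u for a, u in zip(activation, unit)]


# Hypothetical stand-ins: a real refusal direction would be estimated from
# model activations, and a real residual stream has far more dimensions.
refusal_direction = [1.0, 2.0, 2.0, 0.0]
activation = [0.5, -1.0, 3.0, 2.0]

ablated = ablate_direction(activation, refusal_direction)
print(ablated)
```

After the projection, the ablated vector has (numerically) zero component along the chosen direction, while components orthogonal to it are left unchanged. Applying the same projection to the weights that write into the residual stream is one common way to make such a change permanent, though whether Zen Nano does so is not stated here.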
Ideal Use Cases
- Edge AI: Running AI tasks directly on devices without cloud dependency.
- Mobile Applications: Powering chatbots and AI assistants on smartphones.
- IoT Devices: Providing intelligence to internet-of-things hardware.
- Resource-Constrained Environments: Operating where power and computational resources are limited.
- Real-time Inference: Serving applications that require immediate responses.