QKing-Official/EndAI-Small
Text Generation · Model Size: 1.1B · Quant: BF16 · Ctx Length: 2k · Concurrency Cost: 1 · Published: Apr 23, 2026 · License: MIT · Architecture: Transformer · Open Weights
EndAI-Small by QKing-Official is a 1.1 billion parameter language model built on the TinyLlama architecture. It was trained on a subset of the HuggingFaceH4/ultrachat_200k dataset and optimized for efficient operation on both CPU and GPU. The model is designed for rapid inference in resource-constrained environments, making it a good fit for applications that need a compact, fast model.
EndAI-Small: A Compact and Efficient LLM
EndAI-Small, developed by QKing-Official, is a 1.1 billion parameter language model based on the TinyLlama architecture. This model prioritizes efficiency and speed, making it a strong candidate for applications where computational resources are limited.
Key Capabilities
- Lightweight Design: Built on TinyLlama, ensuring a small footprint.
- Optimized for Speed: Engineered for rapid inference on both CPUs and GPUs.
- Instruction-Tuned: Trained on 3% of the HuggingFaceH4/ultrachat_200k dataset, providing basic instruction-following capabilities.
Good For
- Edge Devices: Ideal for deployment on hardware with limited memory and processing power.
- Local Inference: Enables quick AI processing directly on user devices without requiring powerful cloud infrastructure.
- Rapid Prototyping: Its small size allows for fast experimentation and integration into projects.
- CPU-Bound Applications: Specifically designed to perform well even on CPU-only setups, broadening its accessibility.
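The card does not include usage code, so here is a minimal inference sketch using the Hugging Face transformers auto classes. The repo id `QKing-Official/EndAI-Small` is taken from the card; everything else is an assumption — in particular, the prompt helper guesses the Zephyr-style chat format used by TinyLlama chat checkpoints, since the card does not state whether this model ships its own chat template.

```python
MODEL_ID = "QKing-Official/EndAI-Small"  # repo id from the model card


def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt in the Zephyr style used by TinyLlama
    chat checkpoints. Whether EndAI-Small expects this exact format is an
    assumption, not something the card confirms."""
    return f"<|user|>\n{user_message}</s>\n<|assistant|>\n"


def run_inference(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and generate a completion. Imports are kept local so
    build_prompt() works even without torch/transformers installed."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the published quantization; fall back to float32 on CPU,
    # where bfloat16 support varies by hardware.
    use_cuda = torch.cuda.is_available()
    dtype = torch.bfloat16 if use_cuda else torch.float32
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=dtype)
    device = "cuda" if use_cuda else "cpu"
    model.to(device)

    # 2k context window: keep prompt tokens + max_new_tokens under 2048.
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example (downloads the checkpoint on first run):
# print(run_inference(build_prompt("Summarize what a transformer is.")))
```

On CPU-only machines this runs as-is thanks to the float32 fallback; for the edge scenarios above, a quantized export (e.g. GGUF via llama.cpp, also an assumption about available conversions) would shrink memory further.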