QKing-Official/EndAI-Small

Text Generation | Concurrency Cost: 1 | Model Size: 1.1B | Quant: BF16 | Ctx Length: 2k | Published: Apr 23, 2026 | License: MIT | Architecture: Transformer | Open Weights | Cold

EndAI-Small by QKing-Official is a 1.1-billion-parameter language model built on the TinyLlama architecture. It was trained on a subset of the HuggingFaceH4/ultrachat_200k dataset and optimized for efficient operation on both CPU and GPU. The model targets rapid inference in resource-constrained environments, making it suitable for applications that need a compact, fast AI solution.
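The card's headline numbers (1.1B parameters, BF16 weights) imply a rough lower bound on memory just for the weights: 2 bytes per parameter. A back-of-the-envelope sketch (weights only; activations and the KV cache add more on top):

```python
def weight_memory_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Weight-only memory estimate in GiB; BF16 stores 2 bytes per parameter."""
    return n_params * bytes_per_param / 1024**3

# 1.1B parameters in BF16: roughly 2.05 GiB of weights alone
print(round(weight_memory_gib(1.1e9), 2))  # → 2.05
```

This is why the model is plausible on consumer hardware: the weights fit comfortably in a few gigabytes of RAM or VRAM.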


EndAI-Small: A Compact and Efficient LLM

EndAI-Small, developed by QKing-Official, is a 1.1-billion-parameter language model based on the TinyLlama architecture. It prioritizes efficiency and speed, making it a strong candidate for applications where computational resources are limited.

Key Capabilities

  • Lightweight Design: Built on TinyLlama, ensuring a small footprint.
  • Optimized for Speed: Engineered for rapid inference on both CPUs and GPUs.
  • Instruction-Tuned: Trained on 3% of the HuggingFaceH4/ultrachat_200k dataset, providing basic instruction-following capabilities.
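Since the model is instruction-tuned, the usual way to run it would be the standard Hugging Face `transformers` chat pattern. The sketch below is an assumption, not official usage from this card: only the repo id `QKing-Official/EndAI-Small` comes from the page, and the code assumes the repo ships a tokenizer with a chat template (as TinyLlama-derived chat models typically do).

```python
def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Hypothetical usage sketch: load EndAI-Small and answer one prompt.

    Assumes the standard transformers API and a repo-provided chat template.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer  # lazy import

    model_id = "QKing-Official/EndAI-Small"  # repo id from this card
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)  # BF16 weights per the card

    messages = [{"role": "user", "content": prompt}]
    inputs = tok.apply_chat_template(
        messages, return_tensors="pt", add_generation_prompt=True
    )
    out = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt
    return tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Summarize what a transformer model is in one sentence."))
```

On CPU-only machines this should still run, just more slowly; the small parameter count is what makes that practical.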

Good For

  • Edge Devices: Ideal for deployment on hardware with limited memory and processing power.
  • Local Inference: Enables quick AI processing directly on user devices without requiring powerful cloud infrastructure.
  • Rapid Prototyping: Its small size allows for fast experimentation and integration into projects.
  • CPU-Bound Applications: Specifically designed to perform well even on CPU-only setups, broadening its accessibility.
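One practical constraint for the deployment scenarios above is the 2k-token context window from the card's metadata. A minimal sketch of budgeting prompts against that limit, using a crude characters-per-token heuristic (the ~4 chars/token figure and the helper itself are assumptions; use the model's real tokenizer for exact counts):

```python
def truncate_to_context(text: str, ctx_tokens: int = 2048, reserve: int = 256,
                        chars_per_token: float = 4.0) -> str:
    """Trim `text` so the prompt plus `reserve` tokens of generation budget
    fit inside a ctx_tokens-token context window.

    Keeps the tail of the text (most recent content), which is usually what
    matters for chat-style prompts. chars_per_token ~ 4 is a rough English
    heuristic, not a property of this model's tokenizer.
    """
    budget_chars = int((ctx_tokens - reserve) * chars_per_token)
    return text if len(text) <= budget_chars else text[-budget_chars:]

# Short inputs pass through unchanged; long ones are clipped to the budget
print(len(truncate_to_context("x" * 10_000)))  # → 7168 chars ≈ 1792 tokens
```

Budgeting like this up front avoids silent truncation or out-of-context errors once the prompt plus the reply would exceed the 2k window.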