appvoid/palmer-002.5
Hosted on Hugging Face · Text generation
Model size: 1.1B · Quantization: BF16 · Context length: 2k · Published: Jan 19, 2024 · License: apache-2.0 · Architecture: Transformer

appvoid/palmer-002.5 is a 1.1 billion parameter "Merging of Experts" (MEoE) language model built on the palmer-002-2401 base. It is biased to act as an assistant without requiring explicit prompts and is optimized for creative writing tasks. Despite its compact size, it performs competitively against other 1B-parameter models on various benchmarks, making it well suited to edge-device deployment.


appvoid/palmer-002.5: A Compact Merging of Experts Model

appvoid/palmer-002.5 is a 1.1 billion parameter "Merging of Experts" (MEoE) language model developed by appvoid. It builds on palmer-002-2401 as its base and is biased to act as an assistant, designed to function effectively without explicit prompting. The model represents an advance in small language models (SLMs): its compact size makes it practical to run on edge devices such as mobile phones and Raspberry Pi boards.

Key Capabilities and Performance

Despite being up to 40% smaller than some counterparts, palmer-002.5 performs strongly against other 1B-parameter models across several benchmarks. Its training philosophy deviates from previous palmer-family models by incorporating more data to increase capability. The model achieves the following scores:

  • MMLU: 0.2534
  • ARC-C: 0.3370
  • OBQA: 0.3740
  • HellaSwag: 0.6128
  • PIQA: 0.7486
  • Winogrande: 0.6535
  • Average: 0.4965

This places it favorably against models like tinyllama-chat and zyte-1b in its class.

Primary Use Case

palmer-002.5 excels in creative writing, demonstrating its ability to generate coherent and imaginative text, as shown in examples where it completes prompts with poetic and descriptive outputs. Its compact size and efficient design make it particularly suitable for applications requiring on-device AI capabilities.
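For on-device or local experimentation, the model can be loaded with the Hugging Face transformers library. The sketch below is a minimal, hypothetical example (not taken from the model card): the model id comes from the card, while the BF16 dtype mirrors the listed quantization, and the sampling settings (temperature, token budget) are assumptions chosen for creative-writing use.

```python
# Hypothetical usage sketch for appvoid/palmer-002.5 with transformers.
# Model id is from the card; dtype and sampling settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "appvoid/palmer-002.5"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Continue a creative-writing prompt with sampled text."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the quantization listed on the card.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,       # sampling suits open-ended creative text
        temperature=0.8,
    )
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

A call such as generate("The old lighthouse keeper watched the storm") would then return the prompt followed by the model's continuation; since the model is biased as an assistant, no system prompt or chat template is required.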

Limitations

As with other transformer-based language models, palmer-002.5 can hallucinate, producing plausible-sounding but incorrect or false statements. It should therefore be used with caution in sensitive scenarios where factual accuracy is critical.