tlphams/zoyllm-7b-slimorca

  • Task: Text generation
  • Model size: 7B
  • Quantization: FP8
  • Context length: 8k
  • Concurrency cost: 1
  • Published: Dec 4, 2023
  • License: cc-by-nc-sa-4.0
  • Architecture: Transformer (open weights)

ZoyLLM-7B-SlimOrca is a 7 billion parameter LoRA-finetuned generative text model developed by Pham Tung Lam and Nguyen Duc Nhan, built on the Mistral-7B-v0.1 base model. It incorporates Grouped-Query Attention, Sliding-Window Attention, and a Byte-fallback BPE tokenizer, and is fine-tuned with the ChatML template for conversational AI. The model targets general text generation and chat applications, and its base model outperforms Llama 2 13B on tested benchmarks.
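To see why LoRA fine-tuning is practical at this scale: instead of updating a full weight matrix, LoRA trains two low-rank factors B (d_out × r) and A (r × d_in) and adds their product to the frozen weights. A quick parameter count makes the savings concrete; the 4096-wide projection is a Mistral-style illustration and the rank r = 16 is an assumed hyperparameter, not the authors' actual setting.

```python
# Back-of-envelope comparison of trainable parameters:
# full fine-tuning of a d_out x d_in matrix vs. its LoRA factors.
def lora_params(d_out, d_in, r):
    full = d_out * d_in           # parameters updated by full fine-tuning
    lora = r * (d_out + d_in)     # parameters in the B and A factors
    return full, lora

# Illustrative Mistral-style attention projection (4096 x 4096), rank 16.
full, lora = lora_params(4096, 4096, 16)
print(full, lora, f"{lora / full:.2%}")  # 16777216 131072 0.78%
```

At this assumed rank, the adapter trains under 1% of the parameters of the matrix it modifies, which is why a 7B model can be fine-tuned on modest hardware.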


ZoyLLM-7B-SlimOrca: A LoRA-Finetuned Mistral-7B Model

ZoyLLM-7B-SlimOrca is a 7 billion parameter large language model developed by Pham Tung Lam and Nguyen Duc Nhan. It is built on the Mistral-7B-v0.1 base model, which has been shown to outperform Llama 2 13B across various benchmarks. The model leverages advanced architectural features including Grouped-Query Attention and Sliding-Window Attention, alongside a Byte-fallback BPE tokenizer.

Key Capabilities & Training

  • Base Model Performance: Utilizes Mistral-7B-v0.1, known for strong performance relative to its size.
  • Fine-tuning: LoRA-finetuned on a diverse dataset comprising 20 self-introduction samples, 100k examples randomly sampled from SlimOrca, and the EverythingLM v3 dataset.
  • Chat Template: Optimized for conversational interactions using the ChatML template, making it suitable for dialogue-based applications.
  • Architectural Enhancements: Incorporates Grouped-Query Attention and Sliding-Window Attention for efficient processing.
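The ChatML template mentioned above can be sketched as a simple string format. The `<|im_start|>`/`<|im_end|>` tags and role names below follow the standard ChatML convention; the model card does not spell out the exact format, so treat this as an assumption.

```python
# Sketch: rendering a conversation in the standard ChatML format.
def format_chatml(messages):
    """Render a list of {"role", "content"} dicts as a ChatML prompt."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Introduce yourself."},
])
print(prompt)
```

Generation then stops when the model emits `<|im_end|>`, which chat-serving stacks typically register as a stop token.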

Performance & Use Cases

Evaluated on the Open LLM Leaderboard, ZoyLLM-7B-SlimOrca achieves an average score of 51.44. Specific benchmark results include 50.60 on the AI2 Reasoning Challenge, 72.12 on HellaSwag, and 48.78 on MMLU. Its conversational fine-tuning data and chat template make it well suited for:

  • General Chatbots: Engaging in natural language conversations.
  • Question Answering: Providing concise answers based on provided context, as demonstrated in RAG testbench samples.
  • Personalized AI: Capable of self-introduction and maintaining a defined persona.
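For the RAG-style question answering above, a common pattern is to place retrieved context in the user turn and instruct the model to answer only from it. The instruction wording below is an illustrative assumption, not taken from the model's actual RAG testbench.

```python
# Sketch: a minimal RAG-style QA prompt in ChatML format.
def build_rag_prompt(context, question):
    system = ("Answer the question concisely using only the provided "
              "context. If the context does not contain the answer, say so.")
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return (f"<|im_start|>system\n{system}<|im_end|>\n"
            f"<|im_start|>user\n{user}<|im_end|>\n"
            f"<|im_start|>assistant\n")

prompt = build_rag_prompt(
    "Mistral-7B-v0.1 uses Sliding-Window Attention.",
    "What attention mechanism does Mistral-7B use?",
)
print(prompt)
```

Grounding the system instruction in the supplied context is what encourages the concise, context-bound answers the RAG samples demonstrate.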