bobofrut/ladybird-base-7B-v8

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quantization: FP8 · Context Length: 4k · Published: Mar 23, 2024 · License: apache-2.0 · Architecture: Transformer

Ladybird-base-7B-v8 is a 7-billion-parameter large language model developed by bobofrut, built on the efficient Mistral architecture with a 4096-token context length. It incorporates Grouped-Query Attention, Sliding-Window Attention, and a Byte-fallback BPE Tokenizer to improve efficiency and language coverage. The model is designed for complex language understanding and generation tasks and demonstrates strong performance across benchmarks including Winogrande, TruthfulQA, and GSM8K.


Model Overview

Ladybird-base-7B-v8 builds on the efficient Mistral architecture and integrates several advanced architectural features that optimize its performance on complex language understanding and generation tasks.
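
As a quick orientation, the snippet below shows how the model would typically be loaded and queried with the Hugging Face transformers library. It is a minimal sketch assuming the weights are published on the Hub under the repo id shown above; the prompt and generation settings are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bobofrut/ladybird-base-7B-v8"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Grouped-query attention is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```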

Key Architectural Features

  • Grouped-Query Attention: Shares each key/value head across a group of query heads, shrinking the KV cache and cutting computational overhead while largely preserving model quality (see the sketch after this list).
  • Sliding-Window Attention: Restricts each token's attention to a fixed window of recent positions; stacked layers extend the effective receptive field, letting the model handle long-range dependencies at lower cost.
  • Byte-fallback BPE Tokenizer: Combines Byte-Pair Encoding (BPE) with a fallback to raw bytes for out-of-vocabulary characters, so any input can be tokenized and language coverage stays comprehensive.
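
To make the first two features concrete, here is a minimal PyTorch sketch of grouped-query attention combined with a sliding-window causal mask. The shapes, head counts, and window size are toy assumptions; this is an illustration, not the model's actual implementation.

```python
import torch
import torch.nn.functional as F

def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    # True where query position i may attend to key position j:
    # causal (j <= i) and within the last `window` positions.
    i = torch.arange(seq_len).unsqueeze(1)
    j = torch.arange(seq_len).unsqueeze(0)
    return (j <= i) & (j > i - window)

def grouped_query_attention(q, k, v, window: int) -> torch.Tensor:
    # q: (batch, n_heads, seq, head_dim)
    # k, v: (batch, n_kv_heads, seq, head_dim), with n_kv_heads < n_heads
    n_heads, n_kv_heads = q.shape[1], k.shape[1]
    group = n_heads // n_kv_heads
    # Each key/value head serves `group` query heads, shrinking the KV cache.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    mask = sliding_window_causal_mask(q.shape[2], window).to(q.device)
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

# Toy example: 8 query heads sharing 2 KV heads, window of 4 tokens.
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 2, 16, 64)
v = torch.randn(1, 2, 16, 64)
print(grouped_query_attention(q, k, v, window=4).shape)  # (1, 8, 16, 64)
```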

Performance Highlights

The model demonstrates solid performance across standard evaluation benchmarks (a reproduction sketch follows the list):

  • Winogrande: accuracy of 0.8272.
  • TruthfulQA (MC2): accuracy of 0.7736.
  • GSM8K: exact-match score of 0.7650.
  • ARC-Challenge: accuracy of 0.6749.
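
These task and metric names match the conventions of EleutherAI's lm-evaluation-harness. The snippet below is a hedged sketch of how such scores are typically reproduced, assuming the model loads through the harness's Hugging Face backend; it is not necessarily the evaluation setup the author used.

```python
# Requires: pip install lm-eval
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=bobofrut/ladybird-base-7B-v8",  # assumed Hub repo id
    tasks=["winogrande", "truthfulqa_mc2", "gsm8k", "arc_challenge"],
)
for task, metrics in results["results"].items():
    print(task, metrics)
```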

Instruction Format

To make full use of the model's instruction fine-tuning, construct prompts in the ChatML format; this keeps responses accurate and context-aware. A sketch of the format follows.
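
For illustration, a ChatML prompt wraps each conversation turn in `<|im_start|>` and `<|im_end|>` markers. In the sketch below, only the markers and role names come from the ChatML convention; the system and user messages are placeholders.

```python
# Hypothetical ChatML prompt; the markers and role names follow the
# ChatML convention, while the message content is a placeholder.
prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Summarize grouped-query attention in one sentence.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
```

If the released tokenizer ships a chat template, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` from the transformers library produces the same string from a list of role/content dictionaries.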