juhwanlee/llmdo-Mistral-7B-case-c-v1

Task: Text Generation | Model Size: 7B | Quantization: FP8 | Context Length: 4k | Concurrency Cost: 1 | Published: Mar 4, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights

juhwanlee/llmdo-Mistral-7B-case-c-v1 is a 7-billion-parameter large language model developed by Juhwan Lee on top of Mistral-7B-v0.1. It is fine-tuned specifically for data ordering tasks on a dataset sampled from Open-Orca, and it inherits Mistral's architectural features: Grouped-Query Attention, Sliding-Window Attention, and a byte-fallback BPE tokenizer. These traits make it suitable for specialized data manipulation applications.


Model Overview

juhwanlee/llmdo-Mistral-7B-case-c-v1 builds on the Mistral-7B-v0.1 architecture, which provides Grouped-Query Attention for faster inference, Sliding-Window Attention for efficient handling of longer sequences, and a byte-fallback BPE tokenizer that avoids out-of-vocabulary failures. On top of this base, the model has been fine-tuned specifically for data ordering tasks.
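Because the checkpoint is published on the Hugging Face Hub under this id, it can be loaded with the standard Transformers API. The sketch below is a minimal loading helper, not a documented recipe; the generation settings (greedy decoding, 256 new tokens) and the prompt are illustrative assumptions.

```python
MODEL_ID = "juhwanlee/llmdo-Mistral-7B-case-c-v1"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and return a completion.

    Requires `transformers` and `torch`, plus roughly 14 GB of memory
    for the weights in half precision; imports are deferred so the
    module can be inspected without the heavy dependencies.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Strip the echoed prompt tokens and decode only the continuation.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

A call such as `generate("Order the following records by date: ...")` would then return the model's arrangement of the listed records.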

Key Capabilities & Training

The primary focus of this model is data ordering. It was fine-tuned using a 100,000-sample dataset randomly selected from the Open-Orca dataset. This specialized training aims to optimize its performance for tasks requiring structured data arrangement.
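The card states only that 100,000 examples were randomly selected from Open-Orca; the exact sampling procedure is not documented. A minimal stdlib sketch of uniform sampling without replacement (the fixed seed and the in-memory list representation are assumptions for illustration):

```python
import random

def sample_subset(dataset, k=100_000, seed=42):
    """Uniformly sample k items without replacement.

    A fixed seed makes the subset reproducible across runs.
    """
    if k > len(dataset):
        raise ValueError("k exceeds dataset size")
    rng = random.Random(seed)
    return rng.sample(dataset, k)

# Toy demonstration with a small stand-in dataset:
toy = [{"id": i, "question": f"q{i}"} for i in range(1000)]
subset = sample_subset(toy, k=100)
```

In practice the same selection could be done lazily with the `datasets` library's shuffle-and-select utilities rather than materializing the full corpus in memory.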

Performance Benchmarks

Evaluations on the Open LLM Leaderboard show the model's performance across various metrics:

  • Avg.: 62.16
  • AI2 Reasoning Challenge (25-Shot): 62.03
  • HellaSwag (10-Shot): 83.55
  • MMLU (5-Shot): 62.69
  • TruthfulQA (0-shot): 45.82
  • Winogrande (5-shot): 79.08
  • GSM8k (5-shot): 39.80
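The leaderboard's Avg. figure is simply the arithmetic mean of the six task scores, which can be checked directly:

```python
# Open LLM Leaderboard scores reported above:
# ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8k
scores = [62.03, 83.55, 62.69, 45.82, 79.08, 39.80]
avg = round(sum(scores) / len(scores), 2)
print(avg)  # 62.16
```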

Detailed results are available on the Hugging Face Open LLM Leaderboard.

When to Use This Model

This model is particularly suited for use cases that involve data ordering or require a foundational Mistral-7B model with specific fine-tuning for structured data manipulation. Its specialized training makes it a candidate for applications where the arrangement and sequencing of data are critical.