paulml/ECE-ILAB-Q1

Text generation | Model size: 72.7B | Quantization: FP8 | Context length: 32k | Published: Jun 6, 2024 | License: other | Architecture: Transformer

ECE-ILAB-Q1 is a 72.7-billion-parameter instruction-tuned causal language model developed by Louis Garcia, Matthieu Jollard, Andre-Louis Rochet, and Paul Lemaistre of ECE engineering school and TW3 Partners. It is a merge of Qwen/Qwen2-72B-Instruct and cognitivecomputations/dolphin-2.9.2-qwen2-72b, designed to combine the strengths of both base models. With a context length of 131,072 tokens, it targets robust performance across a range of tasks, as reflected in its Open LLM Leaderboard evaluation.


ECE-ILAB-Q1: Merged 72.7B Parameter Model

ECE-ILAB-Q1 is a 72.7-billion-parameter language model developed by a team including Louis Garcia, Matthieu Jollard, Andre-Louis Rochet, and Paul Lemaistre, with sponsorship from ECE engineering school. It merges two base models, Qwen/Qwen2-72B-Instruct and cognitivecomputations/dolphin-2.9.2-qwen2-72b, using MergeKit.
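The model card does not publish the exact merge recipe, so the following is only a minimal sketch of how two such checkpoints could be combined with MergeKit's Python API. The merge method (linear), the 0.5/0.5 weights, and the output path are all assumptions for illustration, not the authors' actual configuration.

```python
# Hypothetical MergeKit sketch: the real merge method and weights used for
# ECE-ILAB-Q1 are not documented, so this only illustrates the workflow.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YAML = """
merge_method: linear        # assumption; the actual method is not stated
models:
  - model: Qwen/Qwen2-72B-Instruct
    parameters:
      weight: 0.5           # assumed equal weighting
  - model: cognitivecomputations/dolphin-2.9.2-qwen2-72b
    parameters:
      weight: 0.5
dtype: bfloat16
"""

merge_config = MergeConfiguration.model_validate(yaml.safe_load(CONFIG_YAML))
run_merge(
    merge_config,
    out_path="./ECE-ILAB-Q1-merged",  # hypothetical output directory
    options=MergeOptions(copy_tokenizer=True, lazy_unpickle=True),
)
```

Merging two 72B checkpoints this way requires enough disk and RAM to stream both sets of weights; MergeKit's lazy unpickling keeps peak memory manageable.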

Key Capabilities & Performance

This model is designed to combine the strengths of its constituent models, aiming for broad applicability in instruction-following tasks. It has been evaluated on the Open LLM Leaderboard; notable scores include the following (a reproduction sketch follows the list):

  • IFEval (0-shot): 78.65
  • BBH (3-shot): 53.70
  • MMLU-PRO (5-shot): 50.06
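These numbers come from the leaderboard's own harness runs. A rough local reproduction could use EleutherAI's lm-evaluation-harness; the task names below follow the leaderboard v2 naming convention and are assumptions about the exact configuration, and a model of this size will typically need several GPUs.

```python
# Hedged sketch: reproducing leaderboard-style scores locally with
# lm-evaluation-harness. Task names assume the leaderboard v2 task configs.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=paulml/ECE-ILAB-Q1,dtype=bfloat16,parallelize=True",
    tasks=["leaderboard_ifeval", "leaderboard_bbh", "leaderboard_mmlu_pro"],
    batch_size="auto",
)
print(results["results"])
```

Scores obtained this way may differ slightly from the leaderboard's, since prompt formatting, few-shot sampling, and harness versions all affect the results.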

With a context length of 131,072 tokens, ECE-ILAB-Q1 can process and generate long texts, making it suited to long-document analysis and extended generation as well as general language understanding tasks.
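For basic inference, the model should load through the standard transformers causal-LM API like any Qwen2-based checkpoint. A minimal sketch follows, assuming sufficient GPU memory for a 72.7B model; the prompt and sampling settings are illustrative only.

```python
# Minimal inference sketch for a Qwen2-based instruct model via transformers.
# Assumes enough GPU memory (a 72.7B model typically spans multiple GPUs).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "paulml/ECE-ILAB-Q1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # shard layers across available GPUs
)

# Qwen2-style chat template for instruction-following use.
messages = [{"role": "user", "content": "Summarize the key ideas of model merging."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For long-context work near the 131,072-token limit, serving stacks with paged attention (e.g. vLLM) are usually a better fit than plain generate loops.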

Good For

  • Applications requiring a large context window for detailed analysis or long-form content generation.
  • Tasks benefiting from a merged model's combined strengths in instruction following and general language understanding.
  • Researchers and developers looking for a robust 72.7B parameter model with a strong foundation from Qwen2 and Dolphin variants.