paloalma/ECE-TW3-JRGL-V1

Text generation · Concurrency cost: 4 · Model size: 69B · Quant: FP8 · Context length: 32k · Published: Apr 3, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

paloalma/ECE-TW3-JRGL-V1 is a 69 billion parameter language model developed by Louis Garcia and Matthieu Jollard from ECE, created by merging ShinojiResearch/Senku-70B-Full and 152334H/miqu-1-70b-sf using mergekit. This model is optimized for emotional intelligence tasks, achieving a score of 83.07 on EQ-Bench V2. With a context length of 32768 tokens, it is suitable for applications requiring nuanced understanding of emotional context.


ECE-TW3-JRGL-V1: Merged Model for Emotional Intelligence

ECE-TW3-JRGL-V1 is a 69 billion parameter language model developed by engineering students Louis Garcia and Matthieu Jollard from the French Engineering School ECE, under the supervision of Andre-Louis Rochet and Paul Lemaistre from TW3 Partners. This model was created by merging two base models, ShinojiResearch/Senku-70B-Full and 152334H/miqu-1-70b-sf, using the mergekit tool.

Key Capabilities & Performance

  • Model Architecture: A merge of two 70B-parameter base models, yielding a 69B-parameter model.
  • Merging Method: Uses slerp (spherical linear interpolation), with specific layer ranges and parameter filters applied to the self-attention and MLP layers.
  • Emotional Intelligence: Scores 83.07 on EQ-Bench V2, a benchmark that evaluates a model's ability to understand and respond to emotional cues.
  • Context Length: Supports 32,768 tokens, allowing extensive inputs to be processed in a single pass.
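To make the merging setup concrete, a mergekit slerp merge is typically declared in a YAML config listing the two source models, the layer ranges to combine, and per-filter interpolation weights. The sketch below is hypothetical: the actual layer ranges, `t` values, and dtype used for ECE-TW3-JRGL-V1 are not published here, so every value is illustrative only.

```yaml
# Hypothetical mergekit config sketching the slerp merge described above.
# Layer ranges, filter weights, and dtype are illustrative, not the real recipe.
slices:
  - sources:
      - model: ShinojiResearch/Senku-70B-Full
        layer_range: [0, 80]
      - model: 152334H/miqu-1-70b-sf
        layer_range: [0, 80]
merge_method: slerp
base_model: 152334H/miqu-1-70b-sf
parameters:
  t:
    - filter: self_attn      # interpolation schedule for attention weights
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp            # interpolation schedule for MLP weights
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5             # default for all remaining tensors
dtype: float16
```

The `filter` entries are what the bullet above calls "parameter filters": they let the attention and MLP layers blend the two parents with different weights rather than a single global ratio.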

Why ECE-TW3-JRGL-V1 is Different

This model stands out due to its specific optimization for emotional intelligence, a less common focus for merged models of this scale. Its strong performance on EQ-Bench V2 suggests it is particularly well-suited for applications where understanding and generating emotionally nuanced text is crucial. The merging strategy, combining two high-performing base models, aims to leverage their respective strengths to achieve this specialized capability.
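The slerp operation underlying this merge interpolates corresponding weight tensors along an arc rather than a straight line, which tends to preserve the magnitude of the merged weights. A minimal NumPy sketch of the operation (the real mergekit implementation applies this per tensor across full model checkpoints):

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors."""
    # Normalize to find the angle between the two tensors.
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = float(np.clip(np.dot(v0_n, v1_n), -1.0, 1.0))
    # Nearly colinear tensors: fall back to plain linear interpolation.
    if abs(dot) > 1.0 - 1e-5:
        return (1.0 - t) * v0 + t * v1
    omega = np.arccos(dot)        # angle between the two tensors
    sin_omega = np.sin(omega)
    return (np.sin((1.0 - t) * omega) / sin_omega) * v0 + \
           (np.sin(t * omega) / sin_omega) * v1

# Toy example: halfway between two orthogonal unit vectors stays on the unit circle.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(0.5, a, b)
```

Unlike a plain average (which would give a vector of length ~0.71 here), slerp keeps the interpolated tensor's norm close to that of its parents, one reason it is a popular choice for merging same-architecture models.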

Ideal Use Cases

Consider using ECE-TW3-JRGL-V1 for applications requiring:

  • Emotional analysis and sentiment understanding.
  • Generating empathetic or emotionally appropriate responses.
  • Role-playing or conversational AI where emotional context is vital.
  • Content creation that requires a nuanced emotional tone.

This model is a strong candidate for tasks where a high degree of emotional intelligence is a primary requirement, differentiating it from general-purpose LLMs.