Steelskull/L3.3-Electra-R1-70b

Parameters: 70B · Quantization: FP8 · Context length: 32768
Released: Mar 2, 2025 · License: eva-llama3.3
Overview

L3.3-Electra-R1-70b: An Advanced 70B Language Model

L3.3-Electra-R1-70b is the latest iteration in the "Unnamed" series by SteelSkull: a 70-billion-parameter model built on a custom DeepSeek R1 Distill base, specifically TheSkullery/L3.1x3.3-Hydroblated-R1-70B-v4.4. It uses the SCE merge method to integrate several specialized components into a robust, coherent architecture. The merge is computed in float32, with the final weights output in bfloat16 for optimized performance.

Key Capabilities & Differentiators

  • Enhanced Intelligence & Coherence: User feedback consistently highlights Electra-R1's superior intelligence and coherence, establishing it as the new baseline for the series.
  • Deep Character Insights: The model demonstrates a unique ability to provide deep character insights and unprompted exploration of inner thoughts and motivations, particularly valuable for narrative and roleplay applications.
  • Advanced Reasoning: With appropriate prompting (see the reasoning configuration notes below on structured thinking), Electra-R1 exhibits advanced reasoning capabilities.
  • Custom Base Architecture: Built on the Hydroblated-R1 base, known for stability and enhanced reasoning, with SCE merge settings tuned using extensive community feedback gathered across more than ten model releases.
  • Specialized Component Integration: Incorporates models like EVA-LLaMA-3.33-70B-v0.0 for core capabilities, Wayfarer-Large-70B-Llama-3.3 for storytelling and roleplay, L3.3-70B-Euryale-v2.3 as an all-rounder RP model, 70B-L3.3-Cirrus-x1 for improved coherence, L3.1-70B-Hanami-x1 for balanced responses, and Anubis-70B-v1 for enhanced detail. It also includes Negative_LLAMA_70B and Fallen-Llama-3.3-R1-70B-v1 for reduced bias.
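The component integration described above could be sketched as a mergekit-style configuration. This is an illustrative reconstruction, not the published recipe: the repository org prefixes, the `select_topk` value, and the weighting are assumptions; only the base model, component names, and the float32/bfloat16 dtypes come from this overview.

```yaml
# Hypothetical mergekit config for an SCE merge of the listed components.
# Org prefixes and select_topk are illustrative assumptions.
base_model: TheSkullery/L3.1x3.3-Hydroblated-R1-70B-v4.4
merge_method: sce
models:
  - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.0        # core capabilities
  - model: LatitudeGames/Wayfarer-Large-70B-Llama-3.3 # storytelling / RP
  - model: Sao10K/L3.3-70B-Euryale-v2.3               # all-rounder RP
  - model: Sao10K/70B-L3.3-Cirrus-x1                  # coherence
  - model: Sao10K/L3.1-70B-Hanami-x1                  # balanced responses
  - model: TheDrummer/Anubis-70B-v1                   # enhanced detail
  - model: SicariusSicariiStuff/Negative_LLAMA_70B    # reduced bias
  - model: TheDrummer/Fallen-Llama-3.3-R1-70B-v1      # reduced bias
parameters:
  select_topk: 0.15   # assumed value; the tuned setting is not published here
dtype: float32        # merge computed in float32 (per the overview)
out_dtype: bfloat16   # final weights written in bfloat16
```

In SCE, `select_topk` controls what fraction of parameter deltas are retained from each component before merging; the actual tuned value for Electra-R1 is not stated in this overview.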

Recommended Use Cases

Electra-R1 is particularly well-suited for:

  • Complex Narrative Generation: Its ability to provide deep character insights and coherent storytelling makes it excellent for creative writing.
  • Advanced Roleplay: Excels in scenarios requiring nuanced character interactions and exploration of motivations.
  • Reasoning-Intensive Tasks: Benefits from its enhanced reasoning capabilities for more analytical prompts.

Recommended Sampler Settings

Community-recommended sampler settings by @Geechan:

  • Temperature: 1.0 static (or dynamic 0.8-1.05)
  • Min P: 0.025-0.03
  • DRY: Multiplier 0.8, Base 1.74, Allowed Length 4-6

The model also supports advanced reasoning configurations using XML tags like <think> for structured thought processes.
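As a minimal sketch, the recommended settings could be packed into an OpenAI-compatible request payload. The parameter names `min_p`, `dry_multiplier`, `dry_base`, and `dry_allowed_length` follow conventions used by common local backends (e.g. koboldcpp or text-generation-webui extensions) and may differ on your server; the values themselves are the ones listed above.

```python
# Sketch: recommended sampler settings as a chat-completion payload.
# Field names for Min P and DRY are backend-dependent assumptions;
# the numeric values are the community recommendations above.
payload = {
    "model": "Steelskull/L3.3-Electra-R1-70b",
    "temperature": 1.0,       # static; or dynamic 0.8-1.05 if supported
    "min_p": 0.025,           # recommended range: 0.025-0.03
    "dry_multiplier": 0.8,    # DRY repetition penalty strength
    "dry_base": 1.74,         # DRY penalty growth base
    "dry_allowed_length": 4,  # recommended range: 4-6
    "messages": [
        {"role": "user", "content": "Tell me a story."},
    ],
}
```

Send `payload` to your backend's chat-completions endpoint as usual; unsupported sampler fields are typically ignored or rejected depending on the server, so check your backend's API reference.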