L3.3-Electra-R1-70b: An Advanced 70B Language Model
L3.3-Electra-R1-70b is the latest iteration in SteelSkull's "Unnamed" series: a 70-billion-parameter model built upon a custom DeepSeek R1 Distill base, specifically TheSkullery/L3.1x3.3-Hydroblated-R1-70B-v4.4. The model uses the SCE merge method to integrate several specialized components into a robust, coherent architecture. Merge computations are performed in float32, and the merged weights are output in bfloat16 for optimized performance.
Key Capabilities & Differentiators
- Enhanced Intelligence & Coherence: User feedback consistently highlights Electra-R1's superior intelligence and coherence, making it a new gold standard and baseline for the series.
- Deep Character Insights: The model demonstrates a unique ability to provide deep character insights and unprompted exploration of inner thoughts and motivations, particularly valuable for narrative and roleplay applications.
- Advanced Reasoning: When prompted appropriately, Electra-R1 exhibits advanced, structured reasoning.
- Custom Base Architecture: Built on the Hydroblated-R1 base, known for stability and enhanced reasoning, with SCE merge settings precisely tuned based on extensive community feedback from over 10 different models.
- Specialized Component Integration: Incorporates EVA-LLaMA-3.33-70B-v0.0 for core capabilities, Wayfarer-Large-70B-Llama-3.3 for storytelling and roleplay, L3.3-70B-Euryale-v2.3 as an all-rounder RP model, 70B-L3.3-Cirrus-x1 for improved coherence, L3.1-70B-Hanami-x1 for balanced responses, and Anubis-70B-v1 for enhanced detail. It also includes Negative_LLAMA_70B and Fallen-Llama-3.3-R1-70B-v1 for reduced bias.
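Merges like this are typically expressed as a mergekit YAML recipe. The sketch below is illustrative only, not the published configuration: the `select_topk` value and the bare model paths are placeholders, while the base model, component list, and dtypes come from this card.

```yaml
# Illustrative mergekit-style SCE recipe sketch -- NOT the published config.
# Model paths lack their repo prefixes; select_topk is an assumed value.
merge_method: sce
base_model: TheSkullery/L3.1x3.3-Hydroblated-R1-70B-v4.4
models:
  - model: EVA-LLaMA-3.33-70B-v0.0
  - model: Wayfarer-Large-70B-Llama-3.3
  - model: L3.3-70B-Euryale-v2.3
  - model: 70B-L3.3-Cirrus-x1
  - model: L3.1-70B-Hanami-x1
  - model: Anubis-70B-v1
  - model: Negative_LLAMA_70B
  - model: Fallen-Llama-3.3-R1-70B-v1
parameters:
  select_topk: 0.15   # placeholder, not the tuned value
dtype: float32        # merge computed in float32 (per this card)
out_dtype: bfloat16   # merged weights written in bfloat16 (per this card)
```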
Recommended Use Cases
Electra-R1 is particularly well-suited for:
- Complex Narrative Generation: Its ability to provide deep character insights and coherent storytelling makes it excellent for creative writing.
- Advanced Roleplay: Excels in scenarios requiring nuanced character interactions and exploration of motivations.
- Reasoning-Intensive Tasks: Benefits from its enhanced reasoning capabilities for more analytical prompts.
Recommended sampler settings by @Geechan:
- Temperature: static 1.0 (or dynamic 0.8-1.05)
- Min P: 0.025-0.03
- DRY: Multiplier 0.8, Base 1.74, Length 4-6
The model also supports advanced reasoning configurations using XML tags like <think> for structured thought processes.
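Of these settings, Min P is the simplest to illustrate: it discards every token whose probability falls below a fraction of the top token's probability, then renormalizes. The sketch below is a framework-free illustration of that rule; the function name and toy distribution are invented for the example and are not part of any inference library.

```python
def min_p_filter(probs, min_p=0.03):
    """Min P sampling rule: keep only tokens whose probability is at
    least min_p times the most likely token's probability, then
    renormalize the survivors. min_p=0.025-0.03 matches this card's
    recommended range."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# With a confident distribution, only plausible continuations survive:
dist = {"the": 0.50, "a": 0.30, "cat": 0.19, "zx": 0.01}
filtered = min_p_filter(dist, min_p=0.03)
print(filtered)  # "zx" falls below the 0.03 * 0.50 = 0.015 threshold
```

A higher `min_p` prunes more aggressively, which is why the card pairs this narrow range with a temperature near 1.0.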