GenueAI/Tessera-4

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 14.8B
  • Quant: FP8
  • Ctx Length: 32k
  • Tool Calling: Supported
  • Published: Jul 1, 2026
  • License: apache-2.0
  • Architecture: Transformer
  • Weights: Open

Tessera 4 by GenueAI is a specialized mini-model focused on reasoning, distilled from DeepSeek-R1 using ORPO. It reports 95% on GSM8K and 93% on ARC-Challenge while remaining small enough to run on consumer hardware with 8GB+ VRAM. The model prioritizes logical accuracy over general knowledge, which suits it to complex problem-solving tasks.


Tessera 4: ORPO-Distilled Reasoning Engine

Tessera 4, developed by GenueAI, is a specialized mini-model intended to show that high-performance reasoning is achievable without massive scale. It combines ORPO (Odds Ratio Preference Optimization) with distillation from DeepSeek-R1 to excel in logic and mathematics, and it is designed to run efficiently on consumer hardware with 8GB+ VRAM.
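For readers unfamiliar with ORPO, its published objective adds an odds-ratio penalty to the ordinary supervised loss on the preferred answer, so that the preferred response becomes more likely relative to the rejected one. The sketch below shows that general formulation on scalar inputs. It is not GenueAI's training code; the function names and the weighting value lam are illustrative assumptions.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)), where p = exp(avg_logp).

    avg_logp is the length-normalized (per-token average) log-likelihood
    of a response, so it must be strictly negative.
    """
    if avg_logp >= 0:
        raise ValueError("avg_logp must be negative (p must be below 1)")
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(logp_chosen: float, logp_rejected: float, lam: float = 0.1) -> float:
    """ORPO-style loss for one preference pair.

    loss = NLL(chosen) + lam * -log(sigmoid(log odds ratio))
    lam = 0.1 is an illustrative default, not a value taken from this card.
    """
    nll_chosen = -logp_chosen
    log_odds_ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    # -log(sigmoid(z)) is equal to log(1 + exp(-z))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll_chosen + lam * odds_ratio_term


if __name__ == "__main__":
    good = orpo_loss(math.log(0.8), math.log(0.2))
    bad = orpo_loss(math.log(0.2), math.log(0.8))
    print(f"chosen more likely:   {good:.3f}")
    print(f"rejected more likely: {bad:.3f}")
```

The loss is lower when the preferred response already has the higher likelihood, which is the behavior the penalty term is meant to reward.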

Key Capabilities & Performance

  • Exceptional Reasoning: Achieves 95% on GSM8K and 93% on ARC-Challenge, surpassing its teacher model, DeepSeek-R1, in these core logic benchmarks.
  • Optimized for Logic: Prioritizes logical accuracy over general trivia, with an MMLU score of 66% accepted as a deliberate trade-off for stronger reasoning.
  • Efficient Training: Trained in approximately 8 hours on a single RTX 3090, with a focus on Chain-of-Thought (CoT) path correction to cut verbose output.
  • Demonstrated Accuracy: Reported 100% correct on showcase tasks in high-precision math (e.g., 15 factorial), unit conversion, and complex logical branching puzzles.
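Claims like the 15 factorial example are easy to spot-check, because exact integer arithmetic gives an unambiguous reference value. The sketch below is one way to compare a model's final answer against that reference. The helper name check_integer_answer is hypothetical, and it assumes the final answer has already been extracted from the model's output as a plain string.

```python
import math


def check_integer_answer(model_answer: str, expected: int) -> bool:
    """Return True if model_answer is a non-negative integer equal to expected.

    Thousands separators (commas or underscores) and surrounding
    whitespace are ignored; anything else makes the check fail.
    """
    digits = model_answer.replace(",", "").replace("_", "").strip()
    return digits.isascii() and digits.isdigit() and int(digits) == expected


expected = math.factorial(15)
print(expected)  # 1307674368000
print(check_integer_answer("1,307,674,368,000", expected))  # True
print(check_integer_answer("1307674368001", expected))      # False
```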

Hardware & Usage

  • VRAM: Recommended 8GB+.
  • Compatibility: Optimized for LM Studio, Ollama, and llama.cpp.
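  • Prompt Format: Requires a specific prompt template for optimal performance, described as DeepSeek-V3/R1 style. Note that it uses ChatML-style <|im_start|>/<|im_end|> markers and ends with an opening <|thought|> tag: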
    <|im_start|>system
    You are a highly logical reasoning engine. Think step-by-step.<|im_end|>
    <|im_start|>user
    [Your Question Here]<|im_end|>
    <|im_start|>assistant
    <|thought|>
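When calling the model through a raw completion interface, the template above can be assembled programmatically. The following is a minimal sketch: build_prompt is a hypothetical helper, and the exact whitespace (single newlines between lines, nothing after the final tag) is an assumption based on how the template is printed here.

```python
DEFAULT_SYSTEM = "You are a highly logical reasoning engine. Think step-by-step."


def build_prompt(question: str, system: str = DEFAULT_SYSTEM) -> str:
    """Assemble the prompt in the layout shown in the template above.

    The prompt deliberately ends on the opening <|thought|> tag, so the
    model's completion starts inside the assistant turn.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        "<|im_start|>assistant\n"
        "<|thought|>"
    )


if __name__ == "__main__":
    print(build_prompt("What is 15 factorial?"))
```

Tools such as LM Studio and Ollama usually apply a chat template themselves, so a helper like this is mainly useful when sending raw text to a completion endpoint.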

Good For

  • Applications requiring high-precision mathematical and logical problem-solving.
  • Scenarios where efficient, accurate reasoning is critical on resource-constrained hardware.
  • Developers seeking a model optimized for Chain-of-Thought processing without excessive verbosity.
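For planning on resource-constrained hardware, a rough weights-only estimate helps decide which quantization to download. The sketch below assumes the 14.8B parameter count listed for this model and counts only the weights; the KV cache, activations, and runtime overhead are excluded, so real usage is higher. The function name and the list of bit widths are illustrative.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weights-only footprint in GB (1 GB = 10**9 bytes)."""
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    for label, bits in [("FP16", 16), ("FP8", 8), ("4-bit", 4)]:
        print(f"{label}: {weight_memory_gb(14.8, bits):.1f} GB")
    # FP16: 29.6 GB
    # FP8: 14.8 GB
    # 4-bit: 7.4 GB
```

Under these assumptions, only a build at roughly 4 bits per weight comes in near the 8GB mark, so it is worth checking the size of the specific quantized file against available VRAM before loading it.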