openchat/openchat-3.5-0106

Cold
Public
7B
FP8
4096
License: apache-2.0
Hugging Face
Overview

OpenChat-3.5-0106: A High-Performing 7B Language Model

OpenChat-3.5-0106 is a 7 billion parameter model developed by OpenChat, designed to excel in general chat, coding, and mathematical reasoning tasks. This version represents a substantial advancement, particularly in coding, demonstrating a 15-point improvement over its predecessor, OpenChat-3.5. It achieves a HumanEval score of 71.3 and surpasses ChatGPT (March) and Grok-1 on multiple benchmarks, including HumanEval, MATH, and GSM8K.

Key Capabilities

  • Dual Operating Modes: Features a 'Default Mode (GPT4 Correct)' for general chat and coding, and a 'Mathematical Reasoning Mode' tailored for solving complex math problems.
  • Enhanced Coding Performance: Achieves 71.3 on HumanEval and 65.9 on HumanEval+, outperforming ChatGPT (December 2023) in coding benchmarks.
  • Strong Benchmark Results: Leads its 7B class with an average score of 64.5, outperforming larger models like Grok-0 (33B) and Grok-1 on several key metrics.
  • Experimental Evaluator Support: Includes capabilities for evaluating responses and providing feedback, aligning with frameworks like Prometheus.

Good For

  • Developers requiring a powerful 7B model for code generation and analysis.
  • Applications needing robust mathematical problem-solving abilities.
  • General-purpose conversational AI where high performance is critical.
  • Research into evaluator and feedback mechanisms for LLMs.