openchat/openchat_3.5

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:4kPublished:Oct 30, 2023License:apache-2.0Architecture:Transformer1.1K Open Weights Warm

OpenChat 3.5 is a 7 billion parameter open-source language model developed by OpenChat, fine-tuned using the C-RLFT strategy. It achieves performance comparable to ChatGPT (March) on various benchmarks, notably excelling in reasoning and mathematical tasks. The model is designed for high-performance conversational AI and general-purpose language generation, offering strong capabilities despite its compact size.

Loading preview...

OpenChat 3.5: High-Performance 7B Language Model

OpenChat 3.5 is a 7 billion parameter open-source language model developed by OpenChat, distinguished by its fine-tuning with C-RLFT (a strategy inspired by offline reinforcement learning) on mixed-quality data without preference labels. This approach enables the model to achieve exceptional performance, often comparable to or surpassing larger models and even ChatGPT (March) on several benchmarks.

Key Capabilities & Performance

  • Exceptional Benchmark Scores: Achieves an MT-bench score of 7.81, outperforming many 70B models. It also demonstrates strong results in AGIEval (47.4), BBH MC (47.6), TruthfulQA (59.1), HumanEval (55.5), and particularly in mathematical reasoning with GSM8K (77.3) and MATH (28.6).
  • Efficient Performance: Despite being a 7B model, it shows competitive results against proprietary models like Grok-0 (33B parameters) across various metrics, including average score, MATH, and GSM8k.
  • Optimized for Deployment: The model is designed for high-throughput deployment, with an OpenAI-compatible API server optimized using vLLM, capable of running on consumer GPUs with 24GB RAM.
  • Coding Mode: Supports a dedicated "Coding Mode" for programming challenges, as demonstrated by its HumanEval score.

Good For

  • General Conversational AI: Its strong MT-bench score indicates robust performance in chat-based interactions.
  • Reasoning and Mathematical Tasks: Excels in benchmarks like GSM8K and MATH, making it suitable for applications requiring numerical and logical reasoning.
  • Code Generation: With a HumanEval score of 55.5, it demonstrates solid capabilities in programming tasks.
  • Resource-Constrained Environments: Its 7B parameter size allows for efficient deployment on more accessible hardware while maintaining high performance.