openchat/openchat-3.6-8b-20240522

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8K · Published: May 7, 2024 · License: llama3 · Architecture: Transformer

OpenChat/openchat-3.6-8b-20240522 is an 8 billion parameter instruction-tuned causal language model developed by OpenChat, based on the Llama 3 architecture with an 8192-token context window. This model is optimized using mixed-quality data and is presented as a top-performing open-source 8B model, outperforming Llama-3-8B-Instruct on various benchmarks. It is primarily designed for general chat, coding, and diverse language tasks, offering strong performance in a compact size.


OpenChat-3.6-8B-20240522 Overview

OpenChat-3.6-8B-20240522 is an 8 billion parameter instruction-tuned model built upon the Llama 3 architecture, featuring an 8192-token context length. Developed by OpenChat, this iteration is reported to surpass Llama-3-8B-Instruct and other open-source finetunes and merges on various benchmarks. Its training methodology leverages mixed-quality data to enhance the model's capabilities.

Key Capabilities

  • High Performance: Positioned as a leading open-source 8B model based on internal benchmarks.
  • Optimized for General Tasks: Suitable for coding, chat, and a wide range of general language tasks.
  • OpenAI API Compatibility: Can be served via an OpenAI-compatible API server, optimized for high-throughput deployment using vLLM.
  • Flexible Conversation Templates: Utilizes a modified Llama 3 Instruct template with specific role names (GPT4 Correct User, GPT4 Correct Assistant).
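The modified template can be sketched in plain Python. Note the exact special tokens below (`<|start_header_id|>`, `<|eot_id|>`, etc.) are assumptions based on the standard Llama 3 Instruct format; in practice, prefer the model tokenizer's own `apply_chat_template()` method, which encodes the authoritative template.

```python
# Illustrative sketch of the modified Llama 3 Instruct template described above,
# substituting OpenChat's role names. Special tokens are assumptions; use
# tokenizer.apply_chat_template() from the model's tokenizer for real inference.
ROLE_NAMES = {
    "user": "GPT4 Correct User",
    "assistant": "GPT4 Correct Assistant",
}

def build_prompt(messages):
    """Render a list of {'role': ..., 'content': ...} dicts into one prompt string."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        role = ROLE_NAMES[msg["role"]]
        parts.append(
            f"<|start_header_id|>{role}<|end_header_id|>\n\n{msg['content']}<|eot_id|>"
        )
    # Open the assistant turn so the model generates the reply from here.
    parts.append(f"<|start_header_id|>{ROLE_NAMES['assistant']}<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_prompt([{"role": "user", "content": "Write a haiku about the sea."}])
```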

Good For

  • General-purpose conversational AI: Excels in chat applications and interactive dialogues.
  • Coding assistance: Capable of handling programming and coding challenges.
  • Deployment on consumer hardware: Optimized for efficient serving; the vLLM-based server runs on a consumer GPU with 24 GB of VRAM.
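For local deployment, vLLM's OpenAI-compatible server can host the model. This is a configuration sketch; the flag values shown (context length, port) are assumptions chosen to match the specs above, and should be adjusted for your hardware.

```shell
# Sketch: serve the model behind an OpenAI-compatible API with vLLM.
# Requires a GPU with sufficient VRAM (~24 GB for this 8B model per the card).
python -m vllm.entrypoints.openai.api_server \
  --model openchat/openchat-3.6-8b-20240522 \
  --max-model-len 8192 \
  --port 8000
```

Once running, any OpenAI-compatible client can target `http://localhost:8000/v1` with the model name above.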

Limitations

  • Inherits limitations from its foundation models, affecting complex reasoning, mathematical tasks, and advanced programming.
  • Prone to "hallucination" of non-existent or inaccurate information.
  • May generate harmful, biased, or unsafe responses, necessitating additional safety measures for sensitive applications.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Each config adjusts the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
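These parameters can be passed in the body of a request to an OpenAI-compatible endpoint. The sketch below assembles such a payload; the specific values are illustrative placeholders, not Featherless's actual popular configurations, and the last three parameters (`top_k`, `repetition_penalty`, `min_p`) are vLLM-style extensions rather than part of the core OpenAI API.

```python
import json

# Hypothetical sampler config for an OpenAI-compatible /v1/chat/completions
# request. Values are illustrative, not the actual popular Featherless configs.
sampler_settings = {
    "temperature": 0.8,
    "top_p": 0.95,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    # vLLM-style extension parameters (not in the core OpenAI API):
    "top_k": 40,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}

payload = {
    "model": "openchat/openchat-3.6-8b-20240522",
    "messages": [{"role": "user", "content": "Hello!"}],
    **sampler_settings,
}

body = json.dumps(payload)  # ready to POST to the chat completions endpoint
```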