Intel/neural-chat-7b-v3

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Oct 25, 2023 · License: apache-2.0 · Architecture: Transformer

Intel/neural-chat-7b-v3 is a 7-billion-parameter large language model developed by Intel, fine-tuned from Mistral-7B-v0.1 on the Open-Orca/SlimOrca dataset and aligned with Direct Preference Optimization (DPO) using Intel/orca_dpo_pairs. With an 8192-token context length, the model is optimized for general language tasks and outperforms its base model on LLM Leaderboard benchmarks, particularly ARC and TruthfulQA. It is intended for inference across a wide range of language tasks and offers a robust foundation for further fine-tuning.


Neural-Chat-v3: An Intel-Optimized 7B LLM

Intel/neural-chat-7b-v3 is a 7-billion-parameter large language model developed by Intel, building upon the mistralai/Mistral-7B-v0.1 architecture. It was fine-tuned on the Open-Orca/SlimOrca dataset and further aligned using Direct Preference Optimization (DPO) with the Intel/orca_dpo_pairs dataset, all trained on Intel Gaudi 2 hardware. The model retains the 8192-token context length of its base model.

Key Capabilities

  • Enhanced Performance: Demonstrates significant performance improvements over the base Mistral-7B-v0.1 on various benchmarks, as detailed on the LLM Leaderboard.
  • DPO Alignment: Uses Direct Preference Optimization for improved instruction following and response quality.
  • Intel Hardware Optimization: Developed and optimized on Intel Gaudi 2 processors, with resources available for reproduction and deployment.
  • Flexible Inference: Supports FP32, BF16, and INT4 inference, leveraging intel_extension_for_transformers and intel_extension_for_pytorch for optimized performance.
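The flexible-inference point can be sketched with a minimal BF16 example using the standard Hugging Face `transformers` API. Note that the `### System:` / `### User:` / `### Assistant:` prompt template and the helper names below are assumptions based on common neural-chat usage, not taken from this page:

```python
def build_prompt(system: str, user: str) -> str:
    """Format a request in the assumed neural-chat prompt template.

    The template is an assumption; check the model card for the
    authoritative format before relying on it.
    """
    return f"### System:\n{system}\n### User:\n{user}\n### Assistant:\n"


def run_inference(user_message: str) -> str:
    """Generate a reply with the model loaded in BF16 (half the memory of FP32)."""
    # Heavy dependencies are imported lazily so the prompt helper above
    # stays usable without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Intel/neural-chat-7b-v3"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.bfloat16
    )

    prompt = build_prompt("You are a helpful assistant.", user_message)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```

For INT4 inference, the model card points to `intel_extension_for_transformers`, which wraps the same `from_pretrained` workflow with weight-only quantization; consult its documentation for the exact API.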

Good for

  • General Language Tasks: Suitable for a wide range of language-related inference tasks.
  • Foundation for Fine-tuning: Serves as a strong base model that can be further fine-tuned for specific applications.
  • Intel Hardware Users: Optimized for deployment and performance on Intel Gaudi 2 and other Intel AI hardware.
  • Research and Development: Provides a robust, open-source model for exploring DPO methods and LLM performance improvements.