Intel/neural-chat-7b-v3
Intel/neural-chat-7b-v3 is a 7-billion-parameter large language model developed by Intel, fine-tuned from Mistral-7B-v0.1 on the Open-Orca/SlimOrca dataset and aligned using Direct Preference Optimization (DPO) with Intel/orca_dpo_pairs. With an 8192-token context length, it is optimized for general language tasks and outperforms its base model on the LLM Leaderboard benchmarks, particularly on ARC and TruthfulQA. It is intended for inference across a range of language tasks and offers a robust foundation for further fine-tuning.
Neural-Chat-v3: An Intel-Optimized 7B LLM
Intel/neural-chat-7b-v3 is a 7-billion-parameter large language model developed by Intel, building upon the mistralai/Mistral-7B-v0.1 architecture. It was fine-tuned on the Open-Orca/SlimOrca dataset and further aligned using Direct Preference Optimization (DPO) with the Intel/orca_dpo_pairs dataset, all trained on Intel Gaudi 2 hardware. The model retains the 8192-token context length of its base model.
Key Capabilities
- Enhanced Performance: Demonstrates significant performance improvements over the base Mistral-7B-v0.1 on various benchmarks, as detailed on the LLM Leaderboard.
- DPO Alignment: Uses Direct Preference Optimization to improve instruction following and response quality.
- Intel Hardware Optimization: Developed and optimized on Intel Gaudi 2 processors, with resources available for reproduction and deployment.
- Flexible Inference: Supports FP32, BF16, and INT4 inference, leveraging `intel_extension_for_transformers` and `intel_extension_for_pytorch` for optimized performance.
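As a starting point, the model can be loaded through the standard `transformers` API. The sketch below is a minimal, hedged example: the `### System / ### User / ### Assistant` prompt template follows the single-turn convention commonly used with the neural-chat family, and the `build_prompt` helper is our own illustration, not part of the model's API; verify the exact template against the model card before relying on it.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the neural-chat style template
    (an assumption here; confirm against the model card)."""
    return f"### System:\n{system}\n### User:\n{user}\n### Assistant:\n"


if __name__ == "__main__":
    # Imports kept inside the entry point so the helper above is usable
    # without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Intel/neural-chat-7b-v3"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # torch_dtype="auto" selects BF16 where the hardware supports it,
    # falling back to FP32 otherwise.
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")

    prompt = build_prompt(
        "You are a helpful assistant.",
        "Summarize Direct Preference Optimization in one sentence.",
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For INT4 inference, `intel_extension_for_transformers` exposes a drop-in `AutoModelForCausalLM` replacement with weight-only quantization options; consult that library's documentation for the flags supported by your version.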
Good for
- General Language Tasks: Suitable for a wide range of language-related inference tasks.
- Foundation for Fine-tuning: Serves as a strong base model that can be further fine-tuned for specific applications.
- Intel Hardware Users: Optimized for deployment and performance on Intel Gaudi 2 and other Intel AI hardware.
- Research and Development: Provides a robust, open-source model for exploring DPO methods and LLM performance improvements.