Phind/Phind-CodeLlama-34B-v2
Phind-CodeLlama-34B-v2 is a 34-billion-parameter instruction-tuned causal language model developed by Phind. Fine-tuned on 1.5 billion tokens of high-quality programming data, it achieves 73.8% pass@1 on HumanEval and is proficient in multiple programming languages, including Python, C/C++, and TypeScript. The model is optimized for code generation and programming assistance, offering strong performance among open-source models.
Phind-CodeLlama-34B-v2 Overview
Phind-CodeLlama-34B-v2 is an advanced 34 billion parameter instruction-tuned model developed by Phind, building upon its predecessor, Phind-CodeLlama-34B-v1. This version was fine-tuned on an additional 1.5 billion tokens of proprietary, high-quality programming-related data, specifically focusing on instruction-answer pairs rather than code completion.
Key Capabilities
- Exceptional Code Generation: Achieves a notable 73.8% pass@1 on HumanEval, positioning it as a leading open-source model for code tasks.
- Instruction-Tuned: Designed for steerability and ease of use, accepting the Alpaca/Vicuna instruction format.
- Multi-Language Proficiency: Demonstrates strong capabilities across programming languages, including Python, C/C++, and TypeScript.
- Optimized Training: Utilized DeepSpeed ZeRO 3 and Flash Attention 2 for efficient training on 32 A100-80GB GPUs, completing in 15 hours.
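For context on the HumanEval score above, pass@k is conventionally estimated with the unbiased formula from the HumanEval paper, 1 - C(n-c, k)/C(n, k), where n samples are drawn per problem and c of them pass the tests. This is a general sketch of that estimator, not Phind's evaluation code:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k),
    given n generated samples of which c passed the unit tests."""
    if n - c < k:
        # Fewer failing samples than k: every size-k draw contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k=1 this reduces to the plain fraction of passing samples:
print(pass_at_k(10, 7, 1))  # 0.7
```

A reported pass@1 of 73.8% therefore means that, on average across HumanEval problems, a single sampled completion passes the hidden tests about 74% of the time.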
Good For
- Programming Assistance: Ideal for developers seeking an intelligent assistant for coding tasks, problem-solving, and generating code snippets.
- Multi-Language Development: Suitable for projects involving multiple programming languages due to its broad proficiency.
- Research and Development: Provides a strong baseline for further fine-tuning or integration into larger systems requiring robust code generation capabilities.
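Since the model accepts the Alpaca/Vicuna instruction format noted above, prompts are typically assembled as plain-text sections before being passed to the tokenizer. A minimal sketch, where `build_prompt` is a hypothetical helper and the section headers follow the Alpaca/Vicuna-style layout described on the model card:

```python
def build_prompt(system: str, user: str) -> str:
    # Assemble an Alpaca/Vicuna-style instruction prompt; the model
    # generates its reply after the trailing "### Assistant" header.
    return (
        f"### System Prompt\n{system}\n\n"
        f"### User Message\n{user}\n\n"
        "### Assistant\n"
    )

prompt = build_prompt(
    "You are an intelligent programming assistant.",
    "Implement a linked list in C++.",
)
print(prompt)
```

The resulting string would then be tokenized and fed to the model (e.g. via Hugging Face `transformers` with the `Phind/Phind-CodeLlama-34B-v2` checkpoint); the exact system wording is an illustrative assumption.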