uukuguy/speechless-codellama-dolphin-orca-platypus-34b
uukuguy/speechless-codellama-dolphin-orca-platypus-34b is a 34 billion parameter Code Llama-based model fine-tuned by uukuguy for code generation and understanding, with a 32768 token context length. It was fine-tuned on a blend of the Dolphin, Orca, and Platypus datasets and achieves a pass@1 score of 70.12 on HumanEval (Python). The model targets code completion and infilling, which makes it suitable for developers working on Python-centric coding tasks.
Model Overview
uukuguy/speechless-codellama-dolphin-orca-platypus-34b is a 34 billion parameter language model built on the Code Llama architecture. Developed by uukuguy, it is fine-tuned on a blend of three instruction datasets: Dolphin (1% GPT4), Orca (1% GPT4), and Platypus (100%). The fine-tuning is intended to improve its performance on code-related tasks.
Key Capabilities & Performance
- Code Generation: Excels at code completion and infilling tasks.
- Python Proficiency: Achieves a pass@1 score of 70.12 on HumanEval (Python), outperforming several Code Llama variants on that benchmark.
- General Language Understanding: Exhibits solid performance on general LLM benchmarks, with an average score of 56.80 on the Open LLM Leaderboard, including 53.47 on MMLU and 74.13 on HellaSwag.
- Context Length: Supports a substantial context window of 32768 tokens, beneficial for handling larger codebases or complex prompts.
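For readers unfamiliar with the pass@1 metric quoted above, the following is a minimal sketch of the commonly used unbiased pass@k estimator, 1 - C(n - c, k) / C(n, k), averaged over problems. The function names and the sample data are illustrative, and the source does not say how many samples per problem were drawn for the reported 70.12. Under the assumption of one greedy sample per problem on HumanEval's 164 problems, a score of 70.12 would correspond to 115 problems solved.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: 1 - C(n - c, k) / C(n, k).

    n: samples generated, c: samples that pass the unit tests, k: sample budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Mean per-problem pass@k over a benchmark, expressed as a percentage."""
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical data: (samples generated, samples passing) for four problems.
example = [(1, 1), (1, 0), (1, 1), (1, 1)]
score = benchmark_pass_at_k(example, k=1)  # three of four problems solved
```

With a single sample per problem, pass@1 reduces to the fraction of problems whose one completion passes all tests.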
What Makes This Model Different?
This model's primary differentiator is its fine-tuning on a curated mix of instruction datasets (Dolphin, Orca, Platypus) on top of Phind-CodeLlama-34B, which is itself a fine-tune of Code Llama 34B. The approach aims to strengthen instruction-following and reasoning, particularly in coding scenarios, and the model's HumanEval (Python) pass@1 score of 70.12 is consistent with that goal.
Ideal Use Cases
- Code Completion Tools: Integrating into IDEs for intelligent code suggestions.
- Automated Code Generation: Generating Python code snippets or functions based on natural language descriptions.
- Code Understanding: Assisting in understanding existing codebases through infilling or contextual completion.
- Developer Assistants: Powering applications that help developers write and debug code more efficiently.
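As a rough illustration of the natural-language-to-code use case, the sketch below loads the model with the Hugging Face `transformers` library and generates a completion. Several parts are assumptions rather than facts from the source: the Alpaca-style prompt template in `build_prompt`, the helper names, the default token budget, and the half-precision loading with `device_map="auto"` (which requires the `accelerate` package and substantial GPU memory for a 34B model).

```python
MODEL_ID = "uukuguy/speechless-codellama-dolphin-orca-platypus-34b"
CONTEXT_LENGTH = 32768  # context window stated in this listing


def build_prompt(instruction: str) -> str:
    """Alpaca-style template. This is an assumption, not confirmed by the source."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction.strip()}\n\n### Response:\n"
    )


def max_new_tokens_for(prompt_tokens: int, requested: int,
                       context_length: int = CONTEXT_LENGTH) -> int:
    """Clamp the generation budget so prompt plus output fit in the context window."""
    remaining = context_length - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the context window")
    return min(requested, remaining)


def generate_code(instruction: str, requested_new_tokens: int = 256) -> str:
    # Imported lazily so the helpers above work without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    prompt_len = inputs["input_ids"].shape[1]
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens_for(prompt_len, requested_new_tokens),
        do_sample=False,  # greedy decoding for reproducible output
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)
```

A call such as `generate_code("Write a Python function that reverses a string.")` would download the weights on first use. If the model expects a different prompt format, only `build_prompt` needs to change.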