uukuguy/speechless-codellama-dolphin-orca-platypus-34b

Text generation · Concurrency cost: 2 · Model size: 34B · Quantization: FP8 · Context length: 32k · Published: Sep 7, 2023 · License: llama2 · Architecture: Transformer · Open weights

uukuguy/speechless-codellama-dolphin-orca-platypus-34b is a 34 billion parameter Code Llama-based model fine-tuned by uukuguy for code generation and understanding, with a 32,768-token context length. It was fine-tuned on a blend of the Dolphin, Orca, and Platypus datasets and achieves a HumanEval-Python pass@1 score of 70.12. The model specializes in code completion and infilling, making it well suited for developers working on Python-centric coding tasks.


Model Overview

uukuguy/speechless-codellama-dolphin-orca-platypus-34b is a 34 billion parameter language model built on the Code Llama architecture. Developed by uukuguy, it distinguishes itself through its fine-tuning mix: the Dolphin (sampled at 1%, GPT-4-generated), Orca (sampled at 1%, GPT-4-generated), and Platypus (100%) datasets. This targeted fine-tuning is intended to strengthen its performance on code-related tasks.

Key Capabilities & Performance

  • Code Generation: Excels at code completion and infilling tasks.
  • Python Proficiency: Achieves a HumanEval-Python pass@1 score of 70.12, demonstrating strong Python code generation and outperforming several Code Llama variants.
  • General Language Understanding: Exhibits solid performance on general LLM benchmarks, with an average score of 56.80 on the Open LLM Leaderboard, including 53.47 on MMLU and 74.13 on HellaSwag.
  • Context Length: Supports a substantial context window of 32,768 tokens, beneficial for handling larger codebases or complex prompts.
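To make the completion workflow concrete, here is a minimal sketch of building a request for a server that exposes this model behind an OpenAI-compatible `/v1/completions` endpoint. The endpoint shape, parameter values, and stop sequence are illustrative assumptions, not details from the model card:

```python
import json

# Model identifier from the card; the rest of the payload is a
# hypothetical example of a code-completion request.
MODEL_ID = "uukuguy/speechless-codellama-dolphin-orca-platypus-34b"

def build_completion_request(prompt: str, max_tokens: int = 256) -> str:
    """Serialize a code-completion request as a JSON body."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.2,   # low temperature favors deterministic code
        "stop": ["\n\n\n"],   # crude stop sequence to end a snippet
    }
    return json.dumps(payload)

body = build_completion_request('def is_prime(n: int) -> bool:\n    """')
print(body)
```

In practice this body would be POSTed to the serving endpoint; a low temperature and an explicit stop sequence are common choices for completion-style coding prompts.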

What Makes This Model Different?

This model's primary differentiator is its fine-tuning on a curated mix of instruction datasets (Dolphin, Orca, Platypus) on top of the already strong Phind-CodeLlama-34B. This approach aims to improve instruction-following and reasoning, particularly in coding scenarios, as evidenced by its competitive HumanEval-Python score.

Ideal Use Cases

  • Code Completion Tools: Integrating into IDEs for intelligent code suggestions.
  • Automated Code Generation: Generating Python code snippets or functions based on natural language descriptions.
  • Code Understanding: Assisting in understanding existing codebases through infilling or contextual completion.
  • Developer Assistants: Powering applications that help developers write and debug code more efficiently.
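For IDE or assistant integrations like those above, one practical concern is fitting a large file into the model's 32,768-token context window. A rough sketch, using the common (but approximate) heuristic of about four characters per token rather than an exact tokenizer count:

```python
# Hypothetical helper: trim a source file so prompt + reply fit the
# 32,768-token window. CHARS_PER_TOKEN is a heuristic, not exact.
CTX_TOKENS = 32_768
CHARS_PER_TOKEN = 4

def trim_context(source: str, reserve_tokens: int = 512) -> str:
    """Keep the tail of `source`, reserving room for the model's reply."""
    budget_chars = (CTX_TOKENS - reserve_tokens) * CHARS_PER_TOKEN
    return source[-budget_chars:] if len(source) > budget_chars else source

big_file = "x = 1\n" * 50_000  # ~300k chars, larger than the window
trimmed = trim_context(big_file)
assert len(trimmed) <= (CTX_TOKENS - 512) * CHARS_PER_TOKEN
```

Keeping the tail of the file is a simple policy that works for completion at the cursor; a production assistant would typically use the real tokenizer and smarter context selection.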