CodeUp-Llama-2-13b-chat-hf: Multilingual Code Generation
CodeUp is a 13-billion-parameter instruction-following model built on the Llama 2 foundation, developed by Juyong Jiang and Sunghun Kim. Unlike general-domain LLMs, CodeUp is designed and optimized specifically for multilingual code generation. To sidestep the heavy computational cost of full fine-tuning, it uses parameter-efficient fine-tuning (PEFT) methods such as LoRA, enabling training and inference on consumer-grade hardware like a single RTX 3090.
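A back-of-the-envelope calculation shows why LoRA makes a 13B model trainable on a single GPU: only the low-rank adapter matrices receive gradients. The Llama-2-13B dimensions (hidden width 5120, 40 layers) are public; the adapted modules (`q_proj`/`v_proj`) and rank `r=16` below are illustrative assumptions, not CodeUp's exact recipe.

```python
# Estimate LoRA trainable parameters for a Llama-2-13B-sized model.
# Llama-2-13B: hidden width 5120, 40 transformer layers (public specs).
# Rank and adapted-module choices are hypothetical, for illustration only.

HIDDEN = 5120           # Llama-2-13B hidden dimension
LAYERS = 40             # Llama-2-13B transformer layers
RANK = 16               # hypothetical LoRA rank
ADAPTED_PER_LAYER = 2   # e.g. q_proj and v_proj attention projections

def lora_trainable_params(hidden: int, layers: int, rank: int,
                          adapted_per_layer: int) -> int:
    # Each adapted d x d weight gains two low-rank factors:
    # A (rank x d) and B (d x rank), i.e. rank * (d + d) new parameters.
    per_matrix = rank * (hidden + hidden)
    return per_matrix * adapted_per_layer * layers

trainable = lora_trainable_params(HIDDEN, LAYERS, RANK, ADAPTED_PER_LAYER)
fraction = trainable / 13_000_000_000
print(f"{trainable:,} trainable params ({fraction:.4%} of 13B)")
```

Under these assumptions, only about 13 million parameters (roughly 0.1% of the 13B total) are updated, which is why optimizer state and gradients fit in a single RTX 3090's memory.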
Key Capabilities & Features
- Multilingual Code Generation: Specialized in generating code from natural language instructions across various programming languages, with a focus on Python.
- Parameter-Efficient Fine-Tuning (PEFT): Utilizes techniques like LoRA for efficient adaptation of the Llama 2 model, making it accessible for academic budgets and consumer hardware.
- High-Quality Instruction Data: Trained on a meticulously filtered dataset of 19,000 instruction-following examples for code generation. This dataset was derived from Code Alpaca, with rigorous filtering to remove ambiguous or irrelevant prompts and ensure programming language specificity (defaulting to Python).
- Llama 2 Foundation: Benefits from the robust capabilities of the Llama 2 architecture, adapted for code-specific applications.
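Because CodeUp's instruction data descends from Code Alpaca, a plausible assumption is that it expects the Alpaca-style instruction prompt template at inference time. The sketch below shows how a natural-language request would be wrapped before generation; the exact template CodeUp expects is an assumption, so verify against the model card before relying on it.

```python
# Sketch: wrap an instruction in the Alpaca-style prompt template used by
# the Code Alpaca data CodeUp was trained on. The exact template CodeUp
# expects is an assumption here, not confirmed by this document.

def build_prompt(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )

prompt = build_prompt("Write a Python function that reverses a string.")
print(prompt)

# With the checkpoint downloaded (e.g. via the transformers library), this
# prompt would be tokenized and passed to model.generate(...); that step is
# omitted here since the 13B weights will not fit in a small test setup.
```

The model's completion would then be decoded and the text after `### Response:` taken as the generated code.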
Good For
- Developers and researchers seeking an open-source, instruction-following LLM for code generation.
- Projects requiring natural language to code translation, particularly for Python.
- Resource-constrained environments where PEFT-enabled models are practical.
- Experimentation with Llama 2-based models fine-tuned for specialized coding tasks.