ajibawa-2023/Code-290k-13B
ajibawa-2023/Code-290k-13B is a 13-billion-parameter language model, fine-tuned from the Llama-2 base model and designed for multi-language code generation with detailed explanations. Trained on a dataset of 290,000 code examples spanning Python, Java, JavaScript, Go, C++, Rust, Ruby, SQL, and more, it aims to produce both functional code and a comprehensive accompanying explanation. The model targets developers who want not just code, but a clear understanding of its logic and implementation.
ajibawa-2023/Code-290k-13B: Code Generation with Explanations
This 13-billion-parameter model, fine-tuned from Llama-2, specializes in generating code across multiple programming languages alongside detailed explanations. Developed by ajibawa-2023, it addresses a common shortcoming of LLMs, error-prone code generation, by pairing every answer with an explanation that emphasizes clarity and understanding.
Key Capabilities
- Multi-language Code Generation: Supports Python, Java, JavaScript, Go, C++, Rust, Ruby, SQL, MySQL, R, Julia, Haskell, and more.
- Detailed Explanations: Provides comprehensive explanations accompanying generated code, enhancing developer understanding.
- Extensive Training Data: Trained on approximately 290,000 code examples, each containing two conversations in the Vicuna/ShareGPT format.
- Base Model: Built upon the Llama-2 architecture by Meta.
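The Vicuna/ShareGPT conversation format mentioned above can be sketched as a simple JSON record. The field names below follow the common ShareGPT convention; the exact schema of the Code-290k-ShareGPT dataset may differ, and the example conversation is illustrative only.

```python
import json

# One training record in the conventional ShareGPT layout: a list of
# alternating human/gpt turns. Each of the ~290k examples in the dataset
# reportedly contains two such conversations.
record = {
    "conversations": [
        {
            "from": "human",
            "value": "Write a Python function that reverses a string and explain it.",
        },
        {
            "from": "gpt",
            "value": "def reverse(s):\n    return s[::-1]\n\n"
                     "Slicing with a step of -1 walks the string backwards, "
                     "producing a reversed copy.",
        },
    ]
}

# Serialize the record as it might appear in a JSONL training file.
print(json.dumps(record))
```

Pairing each code answer with an explanation in the same turn is what trains the model to emit both together at inference time.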
Training Details
The model was trained for 165 hours over 3 epochs on 4× A100 80 GB GPUs using DeepSpeed. The training dataset, Code-290k-ShareGPT, combines and expands upon previous datasets such as Python-Code-23k-ShareGPT and Code-74k-ShareGPT.
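A DeepSpeed run of this kind is driven by a JSON configuration file. The actual configuration used for this model was not published; the fragment below is only a minimal sketch of the sort of settings (ZeRO stage, mixed precision, batch sizes) such a multi-GPU fine-tune typically specifies, with illustrative values.

```json
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 8,
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": { "device": "cpu" }
  }
}
```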
Performance
On the Open LLM Leaderboard, the model achieves an average score of 52.96, with notable scores including 81.55 on HellaSwag (10-shot) and 72.69 on Winogrande (5-shot).
Usage
Users can interact with the model using a prompt format similar to Vicuna/ShareGPT v1.1, designed for conversational code generation with explanations. Quantized versions (GPTQ, GGUF, AWQ, Exllama v2) are also available for optimized deployment.
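A minimal inference sketch with Hugging Face transformers is shown below. The system line and turn markers follow the generic Vicuna v1.1 convention ("USER: ... ASSISTANT:") and are an assumption; consult the model card for the canonical template. Generation parameters are illustrative.

```python
def build_prompt(user_message: str) -> str:
    # Vicuna v1.1-style turn structure (assumed template, not verified
    # against the official model card).
    return (
        "This is a conversation with your helpful AI assistant.\n"
        f"USER: {user_message}\n"
        "ASSISTANT:"
    )

if __name__ == "__main__":
    # Loading the full 13B checkpoint requires substantial GPU memory;
    # the quantized variants (GPTQ, GGUF, AWQ, Exllama v2) are lighter.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "ajibawa-2023/Code-290k-13B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = build_prompt(
        "Write a Go function that checks whether a string is a palindrome, "
        "and explain how it works."
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512)

    # Decode only the newly generated tokens, skipping the echoed prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Because the model was fine-tuned on code-plus-explanation conversations, a single-turn prompt like this typically yields both the function and a prose walkthrough of its logic.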