ajibawa-2023/Code-13B: Code Generation with Explanations
ajibawa-2023/Code-13B is a 13-billion-parameter language model, fine-tuned from the Llama-2 base model and designed for enhanced code generation. Unlike many LLMs that output code alone, this model pairs generated code with detailed explanations, aiming to improve developer understanding and reduce errors.
Key Capabilities
- Multi-language Code Generation: Supports a wide array of programming languages including Python, Java, JavaScript, Go, C++, and Rust.
- Detailed Code Explanations: Generates not just the code, but also comprehensive explanations for its functionality and structure.
- Conversation-based Training: Trained on approximately 74,000 two-turn conversations formatted in Vicuna/ShareGPT style, giving the model a conversational grasp of coding requests.
- Robust Training Data: Built upon the Code-74k-ShareGPT dataset, which pairs diverse code examples with their explanations.
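Since the model was trained on Vicuna/ShareGPT-style two-turn conversations, prompts should follow that conversational layout. A minimal sketch of building such a prompt is shown below; the exact system preamble and turn tags are assumptions here, so check the model card for the template actually used during fine-tuning:

```python
# Sketch of a Vicuna/ShareGPT-style prompt for a two-turn coding request.
# SYSTEM text and USER/ASSISTANT tags are assumptions, not the confirmed
# template from the model card.

SYSTEM = (
    "This is a conversation with a helpful AI assistant. "
    "The assistant generates code along with a detailed explanation."
)

def build_prompt(turns):
    """Format (role, message) turns in Vicuna style.

    The final ASSISTANT slot is left open for the model to complete.
    """
    lines = [SYSTEM, ""]
    for role, message in turns:
        tag = "USER" if role == "user" else "ASSISTANT"
        lines.append(f"{tag}: {message}")
    lines.append("ASSISTANT:")
    return "\n".join(lines)

prompt = build_prompt(
    [("user", "Write a Python function that reverses a string.")]
)
```

The resulting string can then be passed to any standard inference stack (e.g. a `transformers` text-generation pipeline) as the input text.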
Training Details
The model underwent full fine-tuning over 3 epochs, taking 42 hours on Azure with 4 x A100 80GB GPUs, utilizing the DeepSpeed codebase. Quantized versions (GPTQ, GGUF, AWQ) are also available, thanks to TheBloke.
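The exact DeepSpeed configuration used for this run is not published; purely as an illustration, a minimal ZeRO-3 config of the kind commonly used for full fine-tuning of a 13B model on 4 x A100 80GB might look like the following (all values are assumptions, not the author's settings):

```json
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 8,
  "bf16": { "enabled": true },
  "gradient_clipping": 1.0,
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true
  }
}
```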
Performance
Evaluations on the Open LLM Leaderboard show an average score of 54.81, with notable scores on HellaSwag (83.28) and the AI2 Reasoning Challenge (57.34), indicating solid general reasoning ability alongside its specialized coding focus.
Good For
- Developers seeking code solutions that come with built-in explanations.
- Educational purposes where understanding the 'why' behind the code is as important as the code itself.
- Automating code generation tasks where clarity and detailed documentation are crucial.