codemateai/CodeMate-v0.1
CodeMate-v0.1 is a 34-billion-parameter intelligent programming assistant developed by CodeMate, fine-tuned on a proprietary 1.8-billion-token dataset of programming problems and solutions. The model excels at generating high-quality code across multiple languages, including Python, C/C++, and TypeScript. Trained with Flash Attention 2 and supporting a 32,768-token context length, it is designed to assist developers with coding tasks.
CodeMate-v0.1: An Intelligent Programming Assistant
CodeMate-v0.1, developed by CodeMate, is a 34-billion-parameter language model designed to serve as an intelligent programming assistant, generating high-quality code solutions for a wide range of programming problems.
Key Capabilities & Training
- Specialized Training Data: The model was fine-tuned exclusively on a proprietary dataset of 1.8 billion tokens of high-quality programming problems and their solutions. This dataset was manually curated and is internal to CodeMate.
- Training Efficiency: Fine-tuning used Flash Attention 2 and ran for 15 hours on 40 A100-80GB GPUs, with a sequence length of 8096 tokens during training.
- Multilingual Code Proficiency: CodeMate-v0.1 demonstrates proficiency across multiple programming languages, including Python, C/C++, TypeScript, Java, and others.
- Prompt Format: It accepts prompts formatted in the Alpaca/Vicuna instruction style.
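The Alpaca-style prompt format mentioned above can be sketched as follows. The template wording and the repository id `codemateai/CodeMate-v0.1` are assumptions based on the common Alpaca convention, not an official specification from CodeMate:

```python
# Minimal sketch of an Alpaca-style instruction prompt (assumed template,
# not official CodeMate documentation).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca instruction format."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

prompt = build_prompt("Write a Python function that reverses a string.")

# With Hugging Face transformers, the prompt would then be tokenized and
# passed to generate(), e.g. (commented out to avoid downloading 34B weights):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("codemateai/CodeMate-v0.1")
# model = AutoModelForCausalLM.from_pretrained("codemateai/CodeMate-v0.1")
# out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=256)
```

The model's completion follows the `### Response:` marker, so generated text is typically split on that marker before being shown to the user.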
Performance & Limitations
Evaluations on the Open LLM Leaderboard report an average score of 58.39, including 55.55 on the AI2 Reasoning Challenge (ARC), 78.03 on HellaSwag, and 40.18 on GSM8K. The model is currently at version 0.1 and has undergone limited testing; CodeMate recommends additional safety testing before real-world deployment.