ajibawa-2023/Code-Mistral-7B
Code-Mistral-7B is a 7-billion-parameter language model developed by ajibawa-2023, built on the Mistral architecture. It is fine-tuned on a combination of code and mathematics datasets: Code-290k-ShareGPT, Code-Feedback, orca-math-word-problems-200k, and Openhermes. The model performs strongly on coding tasks, supports a context length of 8192 tokens, and expects the ChatML prompt format.
Code-Mistral-7B Overview
Code-Mistral-7B is a 7-billion-parameter model developed by ajibawa-2023, built on the Mistral architecture. Its training emphasizes both coding and mathematical problem-solving, drawing on a diverse set of datasets: Code-290k-ShareGPT, Code-Feedback, orca-math-word-problems-200k, and Openhermes. The model was trained for 3 epochs over 33 hours on 4 x A100 80GB GPUs using the Axolotl codebase.
Key Capabilities
- Code Generation: Generates and interprets code reliably across common programming tasks.
- Error Resolution: Capable of assisting with error identification and resolution in programming contexts.
- Mathematical Reasoning: Shows potential in mathematical problem-solving, though performance can be variable.
- ChatML Format: Designed to be used with the ChatML prompt format for conversational interactions.
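Because the model expects ChatML, prompts must wrap each turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of assembling such a prompt in plain Python (the system and user messages here are illustrative, not from the model card):

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML prompt: each turn is wrapped in
    <|im_start|>{role}\n{content}<|im_end|> markers, and the string
    ends with an opened assistant turn for the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Illustrative usage: message contents are placeholders.
prompt = build_chatml_prompt(
    "You are a helpful coding assistant.",
    "Write a Python function that reverses a string.",
)
print(prompt)
```

In practice a tokenizer's built-in chat template can render the same structure; the point is that raw completions without these markers will not match the format the model was trained on.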
Performance Highlights
Evaluated on the Open LLM Leaderboard, Code-Mistral-7B achieved an average score of 69.97. Notable scores include:
- HellaSwag (10-shot): 85.29
- Winogrande (5-shot): 82.24
- GSM8k (5-shot): 68.08
- MMLU (5-shot): 65.00
Good For
- Code-centric applications: Ideal for tasks requiring code generation, completion, or debugging.
- Developers: Useful for integrating into development workflows that benefit from an AI assistant with strong coding abilities.
- Experimentation: Provides a solid base for further fine-tuning on specific coding or mathematical domains.
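For the experimentation use case, the card's training details (Axolotl, ChatML data, 8192-token context, 3 epochs) map naturally onto an Axolotl config. The fragment below is a hypothetical sketch for further fine-tuning: the dataset path and hyperparameters are placeholders, not the author's actual settings.

```yaml
# Illustrative Axolotl config sketch -- not the author's original file.
base_model: ajibawa-2023/Code-Mistral-7B
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer

chat_template: chatml        # the model was trained on ChatML-formatted turns
sequence_len: 8192           # matches the model's context length

datasets:
  - path: your-org/your-code-dataset   # placeholder dataset path
    type: chat_template

num_epochs: 3                # the card reports 3 epochs of training
micro_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 0.00002
```

Keeping `chat_template: chatml` and `sequence_len: 8192` consistent with the base model avoids retraining the model on a prompt format or context window it has never seen.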