pankajmathur/RenCoder-Devstral-Small-2507

TEXT GENERATIONConcurrency Cost:2Model Size:24BQuant:FP8Ctx Length:32kPublished:Dec 18, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

RenCoder-Devstral-Small-2507 by pankajmathur is a 24 billion parameter language model, fine-tuned from mistralai/Devstral-Small-2507. It utilizes Supervised Fine-Tuning (SFT) and Reinforcement Learning with Human Feedback (RLHF) methods like DPO and GRPO. This model is specifically optimized for agentic coding tasks, trained on datasets such as SWE-Bench and NVIDIA Terminal Corpus, making it highly suitable for code generation and automated programming environments.

Loading preview...

RenCoder-Devstral-Small-2507 Overview

RenCoder-Devstral-Small-2507 is a 24 billion parameter language model developed by pankajmathur. It is built upon the mistralai/Devstral-Small-2507 base model and has undergone further training using a combination of Supervised Fine-Tuning (SFT) and Reinforcement Learning with Human Feedback (RLHF) techniques, specifically DPO (Direct Preference Optimization) and GRPO.

Key Capabilities

  • Agentic Coding: Optimized for tasks requiring autonomous code generation and interaction, leveraging training on specialized datasets.
  • Enhanced Performance: Benefits from SFT and RLHF on agentic coding datasets like SWE-Bench and NVIDIA Terminal Corpus, aiming to improve its coding proficiency.
  • Base Model Heritage: Inherits the strong foundational capabilities of the mistralai/Devstral-Small-2507 architecture.

Good For

  • Automated Code Generation: Ideal for applications requiring models to generate or complete code in an agentic fashion.
  • Developer Tools: Suitable for integration into tools that assist with programming tasks, debugging, or automated development workflows.
  • Research in RLHF for Code: Provides a strong base for further experimentation and development in reinforcement learning applied to coding models.

This model operates with bfloat16 precision and is released under the Apache 2.0 license, inherited from its base model.