dty1aaa/codellama-7b-instruct-hf-sft

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Mar 30, 2026 · License: other · Architecture: Transformer

The dty1aaa/codellama-7b-instruct-hf-sft model is a fine-tuned version of CodeLlama-7b-Instruct-hf, developed by dty1aaa. This 7 billion parameter instruction-tuned model is specifically trained on the EFFIINSTRUCT_python dataset. It is optimized for generating effective Python code, building upon the foundational capabilities of the CodeLlama architecture.


Overview

This model, dty1aaa/codellama-7b-instruct-hf-sft, is a specialized instruction-tuned variant of CodeLlama-7b-Instruct-hf. It has been fine-tuned by dty1aaa with a focus on code generation, specifically for Python.

Key Capabilities

  • Python Code Generation: The model is trained on the EFFIINSTRUCT_python dataset, making it proficient in generating effective Python code based on instructions.
  • Instruction Following: As an instruction-tuned model, it is designed to understand and respond to natural language prompts for coding tasks.
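Since the model is instruction-tuned on top of CodeLlama-7b-Instruct-hf, a reasonable assumption is that prompts should follow the base model's Llama-2-style `[INST]` template. The card does not document the expected format, so the helper below is a hedged sketch of that assumed template:

```python
from typing import Optional


def build_prompt(instruction: str, system: Optional[str] = None) -> str:
    """Wrap a natural-language coding instruction in the Llama-2-style
    [INST] template used by the CodeLlama-Instruct base model.
    (Assumed format -- this model card does not state the template.)"""
    if system is not None:
        instruction = f"<<SYS>>\n{system}\n<</SYS>>\n\n{instruction}"
    return f"[INST] {instruction} [/INST]"


prompt = build_prompt("Write a Python function that reverses a string.")
```

If the fine-tuning used a different chat template, the tokenizer's own `apply_chat_template` (when provided) should be preferred over hand-built prompts.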

Training Details

The model underwent supervised fine-tuning (SFT) with a learning rate of 5e-06 over 4 epochs, reaching a final validation loss of 0.3331. Training used a per-device batch size of 8 with 2 gradient accumulation steps across 4 GPUs, for an effective batch size of 64. The optimizer was Adam with a cosine learning rate schedule.
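The batch-size arithmetic and the cosine schedule can be sketched as follows (warmup steps are omitted because the card does not report any; the schedule shape is an assumption consistent with a standard cosine decay to zero):

```python
import math

# Hyperparameters reported in the model card
LEARNING_RATE = 5e-6
PER_DEVICE_BATCH = 8
GRAD_ACCUM_STEPS = 2
NUM_GPUS = 4

# Effective (total) batch size = per-device batch x accumulation x GPU count
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_GPUS  # 8 * 2 * 4 = 64


def cosine_lr(step: int, total_steps: int, peak_lr: float = LEARNING_RATE) -> float:
    """Cosine decay from peak_lr down to 0 over the training run
    (a common cosine schedule; the exact variant used is not stated)."""
    progress = step / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))
```

At step 0 this yields the full 5e-06 learning rate, decaying smoothly to 0 by the final step.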

Intended Use Cases

This model is primarily intended for developers and researchers who need to generate effective Python code. Its fine-tuning on a dedicated Python instruction dataset suggests strong performance in tasks requiring code completion, generation from natural language, and potentially code explanation within the Python ecosystem.