KAKA22/CodeRM-8B is an 8-billion-parameter instruction-tuned model based on Llama3.1-8B-Instruct, designed to generate Python unit tests for code solutions. It was trained on 60,000 synthetic unit tests derived from the CodeFeedback-Filtered-Instruction and TACO datasets. In code reward modeling tasks, where generated unit tests are used to judge candidate solutions, it performs comparably to much larger models such as Llama3.1-70B-Instruct.
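
To make "code reward modeling" with unit tests more concrete, here is a minimal sketch of one common way such tests can be used: each candidate solution is run against every test, and the candidate that passes the most tests is preferred. This is an illustration under stated assumptions, not the model's actual pipeline. The names `passes`, `score_candidates`, and `pick_best` are hypothetical, the tests are hand-written stand-ins for model-generated ones, and a real setup would execute generated test code in a sandbox rather than as in-process callables.

```python
from typing import Callable, List

# Hypothetical stand-ins: a candidate is a function under test, and a
# "unit test" is a callable that returns True when the candidate passes.
Candidate = Callable[[int], int]
Test = Callable[[Candidate], bool]


def passes(test: Test, candidate: Candidate) -> bool:
    """Run one test against one candidate; any exception counts as a failure."""
    try:
        return bool(test(candidate))
    except Exception:
        return False


def score_candidates(candidates: List[Candidate], tests: List[Test]) -> List[int]:
    """Reward for each candidate = number of tests it passes."""
    return [sum(passes(t, c) for t in tests) for c in candidates]


def pick_best(candidates: List[Candidate], tests: List[Test]) -> int:
    """Return the index of the highest-scoring candidate (first one on ties)."""
    scores = score_candidates(candidates, tests)
    return max(range(len(candidates)), key=scores.__getitem__)


# Toy task: "return the square of x", with three candidate solutions.
def buggy_square(x: int) -> int:
    return x + x  # wrong for most inputs


def good_square(x: int) -> int:
    return x * x


def crashing_square(x: int) -> int:
    raise ValueError("not implemented")


toy_candidates = [buggy_square, good_square, crashing_square]

# Hand-written stand-ins for generated unit tests.
toy_tests = [
    lambda f: f(0) == 0,
    lambda f: f(2) == 4,
    lambda f: f(3) == 9,
    lambda f: f(-4) == 16,
]

if __name__ == "__main__":
    print(score_candidates(toy_candidates, toy_tests))
    print(pick_best(toy_candidates, toy_tests))
```

Counting passed tests is only one possible scoring rule. The point of the sketch is that more (and more discriminating) tests separate correct from incorrect candidates more reliably, which is why the quality of the generated tests matters.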