kakaocorp/kanana-1.5-8b-base

8B parameters · FP8 · 8,192-token endpoint context · License: apache-2.0

Kanana 1.5: Enhanced Base Model for Coding, Math, and Function Calling

Kanana 1.5, developed by KakaoCorp, represents a significant upgrade to the Kanana model family, focusing on improved performance in key technical domains. This 8 billion parameter base model is engineered to tackle more complex real-world problems through its specialized enhancements.
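As a base (non-instruct) model, Kanana 1.5 is typically used for plain text completion or as a starting point for fine-tuning. A minimal loading sketch in the Hugging Face `transformers` style, assuming a standard causal-LM checkpoint layout; the generation parameters are illustrative, not recommendations from the model card:

```python
MODEL_ID = "kakaocorp/kanana-1.5-8b-base"

# Illustrative generation settings for a base (completion-style) model.
GEN_KWARGS = {
    "max_new_tokens": 256,
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.9,
}

def complete(prompt: str) -> str:
    """Plain text completion with the base model.

    Heavy imports are deferred so the constants above can be inspected
    without torch/transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16 fits your hardware
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, **GEN_KWARGS)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Because this is a base model, prompts should be framed as text to continue (e.g. a partially written function) rather than as chat-style instructions.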

Key Capabilities and Features

  • Enhanced Performance: Demonstrates substantial improvements in coding, mathematics, and function calling compared to its predecessor, Kanana-8B.
  • Extended Context Length: Natively handles up to 32,768 tokens and can be extended to 128,000 tokens using YaRN (Yet another RoPE extensioN) scaling.
  • Refined Post-Training: Delivers more natural and accurate conversations due to an optimized post-training process.
  • Performance Metrics: Achieves 61.59 on HumanEval, 57.80 on MBPP, and 63.53 on GSM8K in base model evaluations. The instruct variant shows 76.83 on HumanEval+, 67.99 on MBPP+, and 87.64 on GSM8K (0-shot).
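The jump from the native 32,768-token window to 128,000 tokens comes from YaRN RoPE scaling rather than additional long-context training; in `transformers`-style configs this is usually expressed as a `rope_scaling` entry. A hedged sketch of what such a block could look like (the exact key names and factor should be verified against this checkpoint's actual `config.json`):

```python
NATIVE_CONTEXT = 32_768   # natively trained context length
TARGET_CONTEXT = 128_000  # extended context via YaRN

# YaRN scales RoPE positions by the target/native context ratio.
scaling_factor = TARGET_CONTEXT / NATIVE_CONTEXT  # = 3.90625

# Illustrative rope_scaling block in the transformers config style
# (assumed key names; confirm against the published config).
rope_scaling = {
    "rope_type": "yarn",
    "factor": scaling_factor,
    "original_max_position_embeddings": NATIVE_CONTEXT,
}
```

Keeping the factor tied to the native window matters: YaRN interpolates RoPE frequencies relative to the original training length, so a wrong `original_max_position_embeddings` degrades long-context quality.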

Good For

  • Applications requiring strong code generation and understanding.
  • Tasks involving mathematical reasoning and problem-solving.
  • Use cases benefiting from function calling capabilities.
  • Processing long documents or extended conversational contexts due to its high token limit.
  • Developers seeking a robust base model for further fine-tuning in specialized technical domains.