kakaocorp/kanana-1.5-8b-base

Hugging Face

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 8B
  • Quantization: FP8
  • Context Length: 8K
  • Published: Apr 15, 2025
  • License: apache-2.0
  • Architecture: Transformer
  • Status: Open Weights, Warm

kakaocorp/kanana-1.5-8b-base is an 8-billion-parameter base model from the Kanana 1.5 family, developed by KakaoCorp. It brings substantial enhancements in coding, mathematics, and function calling. The model natively supports up to a 32K-token context length, extensible to 128K tokens with YaRN, making it suitable for complex real-world problems that require extensive document handling or long conversations. A refined post-training process also yields more natural and accurate conversations.
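As a quick orientation, here is a minimal text-completion sketch using Hugging Face transformers. The dtype and device settings are illustrative assumptions for typical 8B-model hardware, not requirements stated by the model card.

```python
# Minimal sketch: plain text completion with the base model via transformers.
# Assumption: a GPU with enough memory for an 8B model in bf16; adjust
# torch_dtype / device_map for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kakaocorp/kanana-1.5-8b-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Base models do plain continuation, not chat, so prompt accordingly.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```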


Kanana 1.5: Enhanced Base Model for Coding, Math, and Function Calling

Kanana 1.5, developed by KakaoCorp, represents a significant upgrade to the Kanana model family, focusing on improved performance in key technical domains. This 8 billion parameter base model is engineered to tackle more complex real-world problems through its specialized enhancements.

Key Capabilities and Features

  • Enhanced Performance: Demonstrates substantial improvements in coding, mathematics, and function calling compared to its predecessor, Kanana-8B.
  • Extended Context Length: Natively handles up to 32,768 tokens and can be configured for up to 128K tokens using YaRN (Yet another RoPE extensioN); see the configuration sketch after this list.
  • Refined Post-Training: Delivers more natural and accurate conversations due to an optimized post-training process.
  • Performance Metrics: Achieves 61.59 on HumanEval, 57.80 on MBPP, and 63.53 on GSM8K in base model evaluations. The instruct variant shows 76.83 on HumanEval+, 67.99 on MBPP+, and 87.64 on GSM8K (0-shot).
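The extended context is typically enabled through a RoPE-scaling override. Below is a minimal sketch assuming the standard transformers rope_scaling mechanism applies to this architecture; the exact keys and scaling factor (4.0, since 32,768 × 4 = 131,072 ≈ 128K) should be verified against the official model card before use.

```python
# Minimal sketch: extending the context window to ~128K tokens with YaRN.
# Assumption: the generic transformers `rope_scaling` config override works
# for this architecture; check the official model card for the exact keys.
import torch
from transformers import AutoModelForCausalLM

model_id = "kakaocorp/kanana-1.5-8b-base"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    # 32,768 native positions * factor 4.0 = 131,072 tokens.
    rope_scaling={
        "rope_type": "yarn",
        "factor": 4.0,
        "original_max_position_embeddings": 32768,
    },
    max_position_embeddings=131072,
)
```

Note that YaRN scaling is usually applied statically, so enabling it can slightly affect quality on short inputs; it is best reserved for workloads that actually need the longer window.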

Good For

  • Applications requiring strong code generation and understanding.
  • Tasks involving mathematical reasoning and problem-solving.
  • Use cases benefiting from function calling capabilities.
  • Processing long documents or extended conversational contexts due to its high token limit.
  • Developers seeking a robust base model for further fine-tuning in specialized technical domains.

Popular Sampler Settings

Featherless surfaces the top three parameter combinations its users apply to this model, covering the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
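For API-based access, sampler parameters like these can be passed through an OpenAI-compatible client. The sketch below assumes Featherless exposes such an endpoint at the base URL shown and accepts the non-standard parameters (top_k, repetition_penalty, min_p) via extra_body; the values are placeholders, not the popular configs referenced above.

```python
# Minimal sketch: setting sampler parameters through an OpenAI-compatible
# client. The base URL and extra_body parameter support are assumptions;
# all values below are placeholders to tune per task.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumption: provider endpoint
    api_key="YOUR_API_KEY",
)

response = client.completions.create(
    model="kakaocorp/kanana-1.5-8b-base",
    prompt="Write a Python function that checks whether a number is prime.",
    max_tokens=256,
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Non-standard sampler knobs pass through the request body.
    extra_body={"top_k": 40, "repetition_penalty": 1.05, "min_p": 0.05},
)
print(response.choices[0].text)
```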