NoesisLab/Kai-30B-Instruct
Kai-30B-Instruct by NoesisLab is a 32.8 billion parameter instruction-tuned language model built on the Qwen2ForCausalLM architecture with a 32K context length. It is specifically optimized for reasoning, mathematical tasks, and code generation, leveraging an Adaptive Dual-Search Distillation (ADS) technique. This model demonstrates strong performance in benchmarks like Winogrande, surpassing larger models in certain common sense reasoning tasks. It is designed for applications requiring robust analytical and generative capabilities across these domains.
Loading preview...
Kai-30B-Instruct: Optimized for Reasoning, Math, and Code
NoesisLab's Kai-30B-Instruct is a 32.8 billion parameter instruction-tuned language model, part of the Kai family, built upon the Qwen2ForCausalLM architecture. It features a substantial 32,768 token context length and utilizes Grouped-Query Attention (GQA) with 8 KV heads for efficient processing. A key differentiator is its optimization for reasoning, mathematics, and code generation tasks, achieved through a novel Adaptive Dual-Search Distillation (ADS) technique.
Key Capabilities & Differentiators
- ADS Technique: Employs Adaptive Dual-Search Distillation, which treats fine-tuning as a constrained optimization problem. This method uses a dynamic loss function with a stateful dual penalty factor to enhance convergence to high-confidence predictions in complex reasoning scenarios without altering the model architecture.
- Strong Benchmark Performance: Achieves notable results in reasoning benchmarks, particularly scoring 86.4 on Winogrande, outperforming Llama-3 70B, Qwen2.5 32B, and Yi-34B in this specific common sense reasoning task.
- Optimized for Specific Domains: Designed with a focus on excelling in reasoning, mathematical problem-solving, and generating code.
Ideal Use Cases
- Complex Reasoning: Applications requiring advanced logical deduction and problem-solving.
- Mathematical Computations: Tasks involving numerical analysis and accurate mathematical responses.
- Code Generation: Development of tools for generating or assisting with programming code.