AscendKernelGen/KernelGen-LM-1.7B

Text Generation · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32K · Published: Jan 23, 2026 · License: apache-2.0 · Architecture: Transformer

KernelGen-LM-1.7B by AscendKernelGen is a domain-adaptive large language model built on the Qwen3-1.7B backbone, specialized for low-level NPU kernel generation for the Huawei Ascend architecture using AscendC. It is trained on the Ascend-CoT dataset and refined with reinforcement learning using execution feedback. This model excels at generating complex Level-2 NPU kernels, addressing tasks where general-purpose models typically fail. Its primary use case is to bridge the gap between general code generation and hardware-specific programming for neural processing units.

AscendKernelGen/KernelGen-LM-1.7B Overview

KernelGen-LM-1.7B is a specialized large language model developed by AscendKernelGen for generating low-level NPU kernels targeting the Huawei Ascend architecture in the AscendC programming language. Built on the Qwen3-1.7B foundation, the model undergoes a two-stage domain-adaptive post-training process: Supervised Fine-Tuning (SFT) with error-derived supervision, followed by Reinforcement Learning (RL) via Direct Preference Optimization (DPO) driven by execution-based correctness and performance signals.
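As a sketch of how such a model might be invoked, the snippet below loads the checkpoint through Hugging Face `transformers` and prompts it for an AscendC kernel. The repo id comes from this card, but the prompt wording, the `format_kernel_request` helper, and the generation settings are illustrative assumptions, not a documented interface.

```python
# Hedged sketch: prompting KernelGen-LM-1.7B for an AscendC kernel.
# The prompt format below is an assumption; consult the model card
# for any officially recommended prompting style.


def format_kernel_request(op_name: str, description: str) -> str:
    """Build a plain-text kernel-generation request (assumed format)."""
    return (
        f"Write an AscendC kernel named `{op_name}`.\n"
        f"Specification: {description}\n"
        "Target: Huawei Ascend NPU. Include tiling and multi-core logic."
    )


def generate_kernel(prompt: str, max_new_tokens: int = 1024) -> str:
    """Run generation; requires `transformers` and the model weights."""
    # Imported lazily so the prompt helper stays usable without the
    # heavy dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "AscendKernelGen/KernelGen-LM-1.7B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    prompt = format_kernel_request(
        "vector_add",
        "Element-wise addition of two float16 tensors of equal shape.",
    )
    print(prompt)
```

The lazy import keeps the prompt-building step lightweight, so request formatting can be tested or batched without loading the 1.7B-parameter weights.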

Key Capabilities & Innovations

  • Domain-Specific Training: Utilizes the high-quality, domain-specific Ascend-CoT Dataset, which incorporates Chain-of-Thought (CoT) reasoning from documentation, code-centric analysis, and general reasoning chains.
  • Hardware-Grounded Evaluation: Performance is rigorously validated using NPUKernelBench, a comprehensive benchmark assessing compilation success, functional correctness, and latency on real Ascend hardware.
  • Enhanced Kernel Generation: Demonstrates significant qualitative improvement in generating complex NPU kernels, particularly for Level-2 tasks, by accurately understanding AscendC-specific APIs, data layout constraints, and multi-core parallelization strategies like tiling.
  • Superior Performance: Outperforms general-purpose models (e.g., Qwen3, Llama3.1) on complex NPU kernel generation, effectively solving tasks where baselines completely fail.

Ideal Use Cases

This model is ideal for developers and researchers focused on:

  • Automating the generation of highly optimized, hardware-specific kernels for Huawei Ascend NPUs.
  • Bridging the gap between high-level AI models and low-level hardware programming.
  • Developing and optimizing custom operations for neural processing units where precise control over hardware resources is critical.