ArnavKewalram/gemma-4-E2B-coder-v1
ArnavKewalram/gemma-4-E2B-coder-v1 is a 3.9 billion parameter instruction-tuned code generation model based on Google's gemma-4-E2B-it, optimized for on-device and offline inference. It achieves 34.1% on HumanEval pass@1, matching Code Llama 7B performance at half the size, and runs on devices with as little as 4 GB RAM. This model excels at generating code in Python, JavaScript, TypeScript, Go, Rust, SQL, Bash, and C++ for laptops and edge devices.
Loading preview...
Overview
ArnavKewalram/gemma-4-E2B-coder-v1 is the first coding fine-tune of Google's gemma-4-E2B-it, a 3.9 billion parameter model. It is specifically designed for efficient, offline code generation on resource-constrained devices, running on as little as 4 GB RAM without a GPU. The model leverages the Griffin architecture, which combines local-attention and linear recurrent layers for lower latency compared to pure-transformer models of similar size.
Key Capabilities
- Code Generation: Excels in Python, JavaScript, TypeScript, Go, Rust, SQL, Bash, and C++.
- High Performance: Achieves 34.1% HumanEval pass@1, comparable to Code Llama 7B, despite being significantly smaller.
- Resource Efficient: Quantized versions (e.g., Q4_K_M at ~3.2 GB) run on CPUs and edge devices with 4 GB RAM.
- Commercial Use: Licensed under Apache 2.0, allowing unrestricted commercial applications.
- Real-world Training: Fine-tuned on 10,000 samples from the Magicoder-OSS-Instruct-75K dataset, comprising real open-source code instruction pairs from GitHub.
Good For
- Developers needing a capable coding assistant that operates fully offline.
- Applications requiring fast CPU inference on laptops or edge devices.
- Projects with strict memory constraints (e.g., 4 GB RAM minimum).
- Commercial products due to its permissive Apache 2.0 license.
Limitations
- Context Length: Trained with a maximum sequence length of 384 tokens, which may affect performance on very long code generation tasks.
- Not evaluated for security-critical code generation.
- Inherits biases and knowledge cutoff from its base model.