Name: ArnavKewalram/gemma-4-E2B-coder-v1 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: ArnavKewalram

Overview

ArnavKewalram/gemma-4-E2B-coder-v1 is the first coding fine-tune of Google's gemma-4-E2B-it, a 3.9 billion parameter model. It is designed for efficient, offline code generation on resource-constrained devices and runs on as little as 4 GB of RAM without a GPU. The model uses the Griffin architecture, which combines local-attention and linear recurrent layers to achieve lower latency than pure-transformer models of similar size.

Key Capabilities

Code Generation: Excels in Python, JavaScript, TypeScript, Go, Rust, SQL, Bash, and C++.
High Performance: Achieves 34.1% pass@1 on HumanEval, comparable to Code Llama 7B, despite being significantly smaller (3.9B vs. 7B parameters).
Resource Efficient: Quantized versions (e.g., Q4_K_M at ~3.2 GB) run on CPUs and edge devices with 4 GB RAM.
Commercial Use: Licensed under Apache 2.0, which permits commercial use subject to the license's attribution and notice terms.
Real-world Training: Fine-tuned on 10,000 samples from the Magicoder-OSS-Instruct-75K dataset, comprising real open-source code instruction pairs from GitHub.
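For readers unfamiliar with the HumanEval metric quoted above: pass@1 is the share of benchmark problems solved by a single sampled completion. The listing does not say how the 34.1% figure was computed, so the following is only a minimal sketch of the commonly used unbiased pass@k estimator. The function names and the sample counts are illustrative assumptions, not this model's evaluation data.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: completions sampled for the problem
    c: how many of those completions passed the unit tests
    k: number of attempts the metric allows
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any k-subset contains a passing one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Made-up counts for three hypothetical problems, 10 samples each.
example = [(10, 3), (10, 0), (10, 10)]
print(round(mean_pass_at_k(example, k=1), 4))
```

With k=1 the estimator reduces to c/n per problem, so the benchmark score is simply the average fraction of passing samples.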

Good For

Developers needing a capable coding assistant that operates fully offline.
Applications requiring fast CPU inference on laptops or edge devices.
Projects with strict memory constraints (the model needs at least 4 GB of RAM).
Commercial products due to its permissive Apache 2.0 license.
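To make the memory figures concrete, here is a rough back-of-the-envelope check of how much RAM remains once the ~3.2 GB Q4_K_M file is loaded on a 4 GB device. The `reserve_gb` allowance for the operating system, runtime, and inference cache is a hypothetical value chosen for illustration; actual overhead depends on the platform and inference settings.

```python
def headroom_gb(model_gb: float, ram_gb: float) -> float:
    """RAM left over after the model weights are loaded."""
    return ram_gb - model_gb


def fits_in_ram(model_gb: float, ram_gb: float, reserve_gb: float = 0.5) -> bool:
    """True if the weights fit while leaving `reserve_gb` free.

    reserve_gb is an assumed allowance, not a measured requirement.
    """
    return headroom_gb(model_gb, ram_gb) >= reserve_gb


# Figures quoted in this listing: ~3.2 GB quantized model, 4 GB of RAM.
print(f"headroom: {headroom_gb(3.2, 4.0):.1f} GB")
print("fits with 0.5 GB reserve:", fits_in_ram(3.2, 4.0, reserve_gb=0.5))
```

The margin is tight, so on a 4 GB machine it may help to close other memory-heavy applications before loading the model.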

Limitations

Context Length: Trained with a maximum sequence length of 384 tokens, so output quality may degrade when the prompt and generated code together exceed that length.
Not evaluated for security-critical code generation.
Inherits biases and knowledge cutoff from its base model.
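One conservative way to work within the 384-token training length is to cap the number of generated tokens so that prompt plus completion stays inside it. This is a sketch of that bookkeeping only. Treating 384 as a budget is an assumption rather than a documented requirement, and the function names are hypothetical. Counting the prompt's tokens is left to whatever tokenizer your runtime provides.

```python
TRAIN_MAX_SEQ_LEN = 384  # training sequence length stated in this listing


def remaining_budget(prompt_tokens: int, max_seq_len: int = TRAIN_MAX_SEQ_LEN) -> int:
    """Tokens left for generation after the prompt, never negative."""
    return max(0, max_seq_len - prompt_tokens)


def clamp_max_new_tokens(prompt_tokens: int, requested: int,
                         max_seq_len: int = TRAIN_MAX_SEQ_LEN) -> int:
    """Limit the requested generation length to the remaining budget."""
    return min(requested, remaining_budget(prompt_tokens, max_seq_len))


# A 300-token prompt leaves room for 84 generated tokens.
print(clamp_max_new_tokens(prompt_tokens=300, requested=256))
```

A result of 0 signals that the prompt alone already exceeds the training length and should probably be shortened or split.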

Full Model Card (README)