Name: issai/foggen API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: issai

What is FogGen?

FogGen is a 0.8-billion-parameter self-aware edge LLM developed by issai, built on the Qwen3-0.6B base model. Its core innovation is emitting a calibrated confidence score alongside its answer in a single forward pass. That score determines whether the local answer is served or the query is routed to a more powerful cloud model, balancing speed and accuracy in edge-cloud deployments.
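The routing decision described above can be sketched in a few lines. This is a minimal illustration, not FogGen's actual interface: `LocalResult`, `route`, and the two model callables are hypothetical names, the confidence is assumed to be a float in [0, 1], and treating a score exactly equal to the threshold as "answer locally" is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class LocalResult:
    """Hypothetical container for what the edge model returns in one pass."""
    answer: str
    confidence: float  # assumed to lie in [0, 1]


def route(
    query: str,
    local_model: Callable[[str], LocalResult],
    cloud_model: Callable[[str], str],
    tau: float = 0.5,
) -> Tuple[str, str]:
    """Return (answer, source), where source is "local" or "cloud".

    The local model runs once and yields both an answer and a confidence.
    If the confidence reaches the threshold tau, the local answer is kept;
    otherwise the query is forwarded to the cloud model.
    """
    result = local_model(query)
    if result.confidence >= tau:  # tie-breaking at tau is an assumption
        return result.answer, "local"
    return cloud_model(query), "cloud"
```

Because the confidence comes from the same forward pass as the answer, the cloud model is only invoked on the fallback branch; no separate router model is called.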

Key Capabilities

Self-Aware Routing: Integrates confidence estimation directly into the inference process, eliminating the need for an external router.
Efficient Resource Utilization: Routes only low-confidence queries to the cloud, reducing latency and computational costs.
Self-Evolving Training: Uses a 14-round sequential training loop (LoRA SFT) in which the model samples its own generations to derive confidence buckets, then fine-tunes on (question, confidence, answer) triples.
Domain Specialization: Trained across seven domains: finance, science, coding, law, math, Kazakh culture, and medicine.
High System Accuracy: Achieves a mean system accuracy of 67.8% at a routing threshold of τ = 0.5 while routing only 21.9% of queries to the cloud, a +4.6% lift over random routing.
Superior Performance: Outperforms AutoMix with higher system accuracy, lower cloud routing percentage, and 9x lower per-query inference cost (1 forward pass vs. 9).
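The system accuracy and cloud routing percentage quoted above can be computed from per-query evaluation records as in the sketch below. It assumes each record holds the local model's confidence plus whether the local and cloud answers were correct; the function name `evaluate_routing` and the record layout are illustrative and not taken from the FogGen evaluation code.

```python
from typing import Iterable, Tuple

# (confidence, local_answer_correct, cloud_answer_correct) for one query
Record = Tuple[float, bool, bool]


def evaluate_routing(records: Iterable[Record], tau: float = 0.5) -> Tuple[float, float]:
    """Return (system_accuracy, cloud_fraction) at routing threshold tau.

    A query counts as answered locally when its confidence is >= tau
    (an assumed convention); otherwise the cloud answer is scored.
    """
    total = routed = correct = 0
    for confidence, local_ok, cloud_ok in records:
        total += 1
        if confidence >= tau:
            correct += int(local_ok)
        else:
            routed += 1
            correct += int(cloud_ok)
    if total == 0:
        raise ValueError("no records to evaluate")
    return correct / total, routed / total


if __name__ == "__main__":
    # Toy records for illustration only, not FogGen results.
    toy = [(0.9, True, True), (0.2, False, True), (0.7, False, True), (0.4, False, False)]
    accuracy, cloud_share = evaluate_routing(toy, tau=0.5)
    print(f"system accuracy={accuracy:.2f}, routed to cloud={cloud_share:.2f}")
```

Sweeping `tau` over a grid with the same function traces the trade-off between accuracy and cloud usage; a random-routing baseline can be scored by replacing the confidence column with random values while keeping a comparable cloud fraction.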

Good For

Edge AI Applications: Ideal for scenarios where local processing is preferred but complex queries require cloud assistance.
Cost-Sensitive Deployments: Reduces cloud API calls by intelligently filtering queries.
Real-time Decision Making: Provides fast local responses for high-confidence queries.
Multi-Domain Question Answering: Excels in specialized domains such as finance, coding, and medicine, with demonstrated generalization to open-ended tasks such as SQuAD and GSM8K.
