Name: baya1116/Phase15-DeepSeek-FFT API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: baya1116

Overview

baya1116/Phase15-DeepSeek-FFT is an experimental, work-in-progress training snapshot of an on-device reasoning model. It is built on a TinyLlama-1.1B base and adds a HyperNetwork-driven soft prompt and a dynamic raw-token window. The target is deployment on iPhones within a ~3GB RAM limit. Its reasoning behavior is distilled from DeepSeek-R1 traces.

Key Capabilities & Architecture

On-device optimization: Designed for resource-constrained environments like iPhones.
Hybrid input: Combines a 128-token soft prompt generated by a HyperNetwork with a small, curriculum-trained raw-token window (currently 8 tokens, progressing to 16).
Recurrent soft-prompt update: The soft prompt (sp_k) is updated recurrently based on the previous soft prompt and the last raw token.
Distillation: Trained on reasoning traces from cognitivecomputations/dolphin-r1, a DeepSeek-R1-derived dataset.
Curriculum learning: The raw_window size increases (1 -> 2 -> 4 -> 8 -> 16 -> 32) each time performance plateaus.
Auxiliary loss: Applied at the last soft-prompt position and at each raw-token position as an additional training signal.
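
The hybrid input, recurrent update, and window curriculum described above can be sketched in plain Python. This is an illustrative sketch only, not the author's training code: the plateau rule (`patience`, `min_delta`), the `update_fn` callback standing in for the HyperNetwork, and the exact ordering of the window slide and the soft-prompt update are all assumptions. Only the 128 soft tokens and the 1 -> 32 window stages come from the listing.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Tuple

SOFT_TOKENS = 128                      # soft-prompt length stated in the listing
WINDOW_STAGES = [1, 2, 4, 8, 16, 32]   # raw_window curriculum stated in the listing


@dataclass
class WindowCurriculum:
    """Grow raw_window when the monitored loss plateaus (assumed rule)."""
    patience: int = 3         # hypothetical: evals without improvement before growing
    min_delta: float = 1e-3   # hypothetical: smallest drop counted as improvement
    stage: int = 0
    best: float = float("inf")
    stale: int = 0

    @property
    def raw_window(self) -> int:
        return WINDOW_STAGES[self.stage]

    def observe(self, loss: float) -> int:
        """Record one evaluation and return the window size to use next."""
        if loss < self.best - self.min_delta:
            self.best, self.stale = loss, 0
        else:
            self.stale += 1
        if self.stale >= self.patience and self.stage < len(WINDOW_STAGES) - 1:
            self.stage += 1                      # move to the next window size
            self.best, self.stale = float("inf"), 0
        return self.raw_window


def input_length(raw_window: int, tokens_seen: int) -> int:
    """Positions fed to the base model: soft tokens plus visible raw tokens."""
    return SOFT_TOKENS + min(raw_window, tokens_seen)


def run_sequence(
    tokens: List[int],
    raw_window: int,
    sp0,
    update_fn: Callable,
) -> Iterator[Tuple[object, List[int]]]:
    """Yield (soft_prompt, raw-token window) for each step of a sequence.

    update_fn is a hypothetical stand-in for the HyperNetwork: it maps the
    previous soft prompt and the last raw token to the next soft prompt.
    """
    sp = sp0
    for k in range(len(tokens)):
        window = tokens[max(0, k + 1 - raw_window): k + 1]
        yield sp, window
        sp = update_fn(sp, tokens[k])   # sp_k from sp_{k-1} and the last raw token
```

With a window of 8, the base model sees at most 128 + 8 = 136 positions per step regardless of how long the underlying sequence is, which is what keeps the memory footprint bounded in this reading of the design.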

Current Status & Limitations

Work-in-progress: This is a training snapshot (step 484), not a final release.
Coherent prose: Currently shows promise in generating coherent prose for advice-related questions.
Arithmetic/Code: Struggles with math and code generation due to the TinyLlama base model's limitations.
Closure problem: The model does not reliably close <think> tags.
Training: Trained on a single RTX 3090 GPU with a batch size of 24-32.
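
Until the closure problem is addressed in training, a caller could patch unclosed tags on the client side. The helper below is a hypothetical workaround, not part of the model or of any Featherless.ai API. It assumes the tags are the literal strings `<think>` and `</think>` and simply appends whatever closing tags are missing.

```python
def close_think(text: str, open_tag: str = "<think>", close_tag: str = "</think>") -> str:
    """Append missing closing tags so every <think> has a matching </think>."""
    missing = text.count(open_tag) - text.count(close_tag)
    if missing > 0:
        return text + close_tag * missing
    return text  # already balanced, or no tags at all
```

This only balances the markup. It cannot tell where the reasoning was meant to end, so any answer text the model produced after an unclosed tag stays inside the reasoning span.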
