Name: the-ai-alchemist/DeepSeek-R1-Distill-Qwen-14B-nethack API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: the-ai-alchemist

Model Overview

the-ai-alchemist/DeepSeek-R1-Distill-Qwen-14B-nethack is a 14.8-billion-parameter language model, fine-tuned by the-ai-alchemist from the DeepSeek-R1-Distill-Qwen model family and converted to the GGUF format.

Key Characteristics

Architecture: Based on the DeepSeek-R1-Distill-Qwen model family.
Parameter Count: 14.8 billion parameters.
GGUF Format: Provided in GGUF format, so it runs on llama.cpp and other GGUF-supporting inference engines, on CPU or GPU.
Optimization: Fine-tuning and GGUF conversion were done with Unsloth, which reportedly made training about 2x faster.
Context Length: Supports a context length of 32,768 tokens.
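
To make the parameter count concrete, the weight footprint of a Q8_0 GGUF file (the quantization named under Usage and Compatibility) can be estimated from the block layout. This is a rough sketch, not a measurement of the actual file: it assumes every weight is stored as Q8_0 (32 int8 values plus one 2-byte scale per block, 34 bytes in total), whereas real GGUF files keep some small tensors at higher precision and add metadata. The function name `q8_0_weight_bytes` is hypothetical, and the estimate excludes the KV cache and runtime buffers.

```python
# Rough size estimate for a model stored entirely in Q8_0.
# Assumption: every weight is Q8_0; real files differ slightly.
Q8_0_BLOCK_WEIGHTS = 32  # weights per Q8_0 block
Q8_0_BLOCK_BYTES = 34    # 32 int8 values + one 2-byte fp16 scale


def q8_0_weight_bytes(n_params: int) -> int:
    """Return the approximate number of bytes needed for n_params Q8_0 weights."""
    blocks = -(-n_params // Q8_0_BLOCK_WEIGHTS)  # ceiling division
    return blocks * Q8_0_BLOCK_BYTES


if __name__ == "__main__":
    size = q8_0_weight_bytes(14_800_000_000)
    print(f"approx. {size / 1e9:.1f} GB ({size / 2**30:.1f} GiB) of weights")
```

Under these assumptions the weights alone come to roughly 15.7 GB, so the file needs at least that much RAM or VRAM before any context is allocated.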

Usage and Compatibility

The model runs with llama-cli for text-only use and with llama-mtmd-cli for potential multimodal use, using Jinja templating. A GGUF file, DeepSeek-R1-Distill-Qwen-14B.Q8_0.gguf, is available for download. The model's BOS token behavior was adjusted to ensure full GGUF compatibility.
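
As an illustration of the text-only path, the following sketch assembles a llama-cli command line from Python. It assumes llama-cli is installed and on PATH and that the GGUF file has already been downloaded to the local path shown. The helper name `build_llama_cli_args`, the prompt, and the default values are hypothetical. The flags used (`-m`, `-c`, `-n`, `--jinja`, `-p`) are standard llama.cpp options, but check `llama-cli --help` for your build.

```python
import shlex
import subprocess


def build_llama_cli_args(model_path: str, prompt: str,
                         ctx_size: int = 32768, max_tokens: int = 256) -> list[str]:
    """Build an argv list for a text-only llama-cli run with Jinja templating."""
    return [
        "llama-cli",
        "-m", model_path,       # path to the GGUF file
        "-c", str(ctx_size),    # context window in tokens
        "-n", str(max_tokens),  # maximum tokens to generate
        "--jinja",              # use the model's Jinja chat template
        "-p", prompt,
    ]


def run_llama_cli(args: list[str]) -> int:
    """Run the command and return its exit code (requires llama-cli on PATH)."""
    return subprocess.run(args, check=False).returncode


if __name__ == "__main__":
    args = build_llama_cli_args(
        "DeepSeek-R1-Distill-Qwen-14B.Q8_0.gguf",  # assumed local path
        "You see a closed door to the east. What do you do?",
    )
    # Print the command instead of running it, so the sketch works without llama.cpp.
    print(shlex.join(args))
```

A smaller `ctx_size` reduces memory use if the full 32,768-token window is not needed.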
