KKHYA/qwen3-14b-fft-if

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:14BQuant:FP8Ctx Length:32kPublished:May 24, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

KKHYA/qwen3-14b-fft-if is a 14 billion parameter language model fine-tuned from Qwen/Qwen3-14B, featuring a 32768 token context length. This model has been specifically fine-tuned across multiple datasets including mft_tulu3_personas_if, mft_oasst1, mft_oasst2, mft_coconot, mft_aya, and mft_daring_anteater. It is designed for general language understanding and generation tasks, leveraging its diverse training data for broad applicability.

Loading preview...

Overview

KKHYA/qwen3-14b-fft-if is a 14 billion parameter large language model, building upon the robust Qwen3-14B architecture. It distinguishes itself through a comprehensive fine-tuning process across a diverse set of datasets, including mft_tulu3_personas_if, mft_oasst1, mft_oasst2, mft_coconot, mft_aya, and mft_daring_anteater. This extensive training aims to enhance its general conversational abilities and instruction following.

Key Capabilities

  • General Language Understanding: Benefits from the Qwen3 base model's strong foundation in comprehending complex language patterns.
  • Instruction Following: Improved through fine-tuning on instruction-based datasets like mft_oasst1 and mft_oasst2.
  • Conversational Fluency: Enhanced by datasets focused on persona-based interactions and general dialogue.
  • Broad Applicability: The diverse training data suggests suitability for a wide range of natural language processing tasks.

Training Details

The model was trained with a learning rate of 1e-05, a total batch size of 128, and utilized a cosine learning rate scheduler with a 0.1 warmup ratio over 2 epochs. This configuration, combined with an AdamW optimizer, aims for stable and effective learning across the varied fine-tuning datasets.

Good For

  • General-purpose chatbots: Its fine-tuning on conversational and instruction datasets makes it suitable for interactive applications.
  • Content generation: Can be used for generating text, summaries, or creative content based on prompts.
  • Research and experimentation: Provides a solid base for further fine-tuning on more specific downstream tasks.