Sao10K/14B-Qwen2.5-Freya-x1
Sao10K/14B-Qwen2.5-Freya-x1 is a 14.8 billion parameter language model built upon the Qwen2.5 architecture, featuring a substantial 131,072 token context length. Developed by Sao10K, this model utilizes a multi-step LoRA training methodology, initially fine-tuned on literature and raw text, then further refined with instruction data. It is optimized for generating creative text and following complex instructions, making it suitable for advanced conversational AI and content generation tasks.
Sao10K/14B-Qwen2.5-Freya-x1: Multi-Step Fine-Tuning for Enhanced Performance
Sao10K/14B-Qwen2.5-Freya-x1 is a 14.8 billion parameter model based on the Qwen2.5 architecture, distinguished by its unique multi-step LoRA training approach. This methodology, inspired by multi-step training techniques, involves two distinct phases to enhance the model's capabilities.
Key Training Methodology
- Freya-S1: Initial LoRA training on approximately 1.1 GB of cleaned literature and raw text, applied over the base `Qwen/Qwen2.5-14B` model. This phase focuses on broad language understanding and generation.
- Freya-S2: The LoRA from S1 is then applied over `Qwen/Qwen2.5-14B-Instruct`, followed by further training on instruction datasets. This stage refines the model's ability to follow instructions and engage in conversational tasks.
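Both phases train LoRA adapters rather than the full 14.8B weights. A LoRA adapter adds a low-rank update to each frozen weight matrix, `W_eff = W + (alpha / r) * B @ A`. The numeric sketch below (shapes and values are illustrative, not the model's real dimensions) shows why this is cheap to train and a no-op before training starts:

```python
import numpy as np

# Illustrative sketch of the low-rank update a LoRA adapter applies to a
# frozen weight matrix: W_eff = W + (alpha / r) * B @ A.
# Dimensions here are toy values, not Qwen2.5-14B's actual layer sizes.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 16   # rank r is much smaller than d_out/d_in

W = rng.standard_normal((d_out, d_in))   # frozen base weight (not trained)
A = rng.standard_normal((r, d_in))       # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-initialized

W_eff = W + (alpha / r) * B @ A
# With B zero-initialized, the adapter starts as an exact no-op:
assert np.allclose(W_eff, W)

# After training, the learned update can never exceed rank r,
# so only (d_out + d_in) * r parameters per layer are trained:
delta = (alpha / r) * (rng.standard_normal((d_out, r)) @ A)
assert np.linalg.matrix_rank(delta) <= r
```

Because the adapter is a separate set of small matrices, the S1 LoRA can be merged into or re-applied over a different base checkpoint, which is what the Freya-S2 step does with `Qwen/Qwen2.5-14B-Instruct`.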
Noteworthy Features
- Extended Context: Supports a substantial context length of 131,072 tokens, enabling processing of long inputs.
- Optimized for ChatML: Recommended for use with the ChatML prompt format, with suggested `temperature: 1+` and `min_p: 0.05` for optimal output quality.
- Diverse Training Data: Utilizes a mix of completion datasets (e.g., eBooks, novels) and chat template datasets (e.g., `10k-amoral-full-fixed-sys.json`, `44k-hespera-smartshuffle.json`, `5k_rpg_adventure_instruct-sys.json`) to build both creative and instructional capabilities.
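To make the ChatML recommendation concrete, here is a minimal sketch of the prompt layout the model expects, built by hand so the structure is visible. In practice the tokenizer's built-in chat template (`apply_chat_template`) would produce the same layout; the system/user strings and the helper name are illustrative:

```python
# Minimal sketch of the ChatML prompt format: each turn is wrapped in
# <|im_start|>{role} ... <|im_end|>, and generation is cued by an open
# assistant turn. The message contents below are illustrative only.
def chatml(messages):
    parts = []
    for role, content in messages:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # model continues from here
    return "".join(parts)

prompt = chatml([
    ("system", "You are a creative writing assistant."),
    ("user", "Write the opening line of a mystery novel."),
])

# Sampling settings suggested by the model card: temperature of 1 or
# higher, with min_p = 0.05 to prune low-probability tokens.
sampling = {"temperature": 1.0, "min_p": 0.05}
```

The `min_p` cutoff discards tokens whose probability falls below 5% of the top token's, which keeps the higher temperature from degrading coherence in long creative generations.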
Use Cases
This model is particularly well-suited for applications requiring detailed text generation, creative writing, and adherence to complex instructions in a conversational setting.