v000000/Qwen2.5-Lumen-14B

  • Status: Warm
  • Visibility: Public
  • Parameters: 14.8B
  • Quantization: FP8
  • Context length: 131,072 tokens
  • License: apache-2.0
  • Weights: Hugging Face
Overview

v000000/Qwen2.5-Lumen-14B is a 14.8-billion-parameter model built on Qwen2.5-14B-Instruct and refined through direct preference optimization (DPO) and model merging. DPO ran for roughly three epochs on datasets including jondurbin/gutenberg-dpo-v0.1 and HuggingFaceH4/ultrafeedback_binarized, and the final model was assembled from multiple DPO checkpoints combined via SLERP gradient merges, with the goal of improving prose quality, instruction adherence, and story/roleplay performance.
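
A minimal usage sketch in Python, assuming the model loads through the standard transformers AutoModelForCausalLM interface and that the tokenizer bundles the usual Qwen2.5 chat template (typical for Qwen2.5 finetunes, but not confirmed by this card):

```python
# Minimal sketch: load v000000/Qwen2.5-Lumen-14B through the standard
# transformers API. The chat-template call assumes the usual Qwen2.5
# template ships with the tokenizer (typical, but not stated here).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "v000000/Qwen2.5-Lumen-14B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # spread layers across available GPUs
)

messages = [
    {"role": "system", "content": "You are a creative writing assistant."},
    {"role": "user", "content": "Write the opening scene of a dark fantasy chapter."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```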

Key Capabilities

  • Enhanced Prompt Adherence: Tuned to follow user prompts closely, keeping outputs aligned with the given instructions.
  • Advanced Story Writing: Excels in generating coherent and engaging long-form narratives, as demonstrated by its ability to produce detailed romance and dark fantasy chapters.
  • Roleplay Proficiency: Optimized for character-based interactions and roleplaying scenarios, making it suitable for interactive storytelling applications.
  • Extended Context Window: Supports a context length of 131,072 tokens for input and up to 8,192 tokens for generation, allowing for lengthy conversations and long-document processing (a budgeting sketch follows this list).

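The input/generation split matters in practice: a long prompt must leave room for the reply inside the 131,072-token window. A short budgeting sketch, where fit_prompt is a hypothetical helper written for illustration rather than part of any library:

```python
# Sketch: budget a long-document prompt against the 131,072-token context,
# reserving the full 8,192-token generation cap. `fit_prompt` is a
# hypothetical helper written for this example, not part of any library.
from transformers import AutoTokenizer

CONTEXT_LEN = 131_072  # maximum input context (from this card)
MAX_NEW = 8_192        # maximum generation length (from this card)

tokenizer = AutoTokenizer.from_pretrained("v000000/Qwen2.5-Lumen-14B")

def fit_prompt(instruction: str, document: str) -> str:
    """Truncate `document` so instruction + document + reply fit the window."""
    instruction_ids = tokenizer(instruction)["input_ids"]
    budget = max(0, CONTEXT_LEN - MAX_NEW - len(instruction_ids))
    document_ids = tokenizer(document)["input_ids"][:budget]
    return instruction + "\n\n" + tokenizer.decode(document_ids)
```
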
Good For

  • Creative Writing: Ideal for authors, content creators, and developers needing assistance with generating novel chapters, short stories, or other narrative content.
  • Roleplaying Applications: Well-suited for chatbots, virtual companions, and interactive fiction platforms that require strong character consistency and dynamic dialogue.
  • Long-form Content Generation: Its large context window makes it effective for tasks requiring the processing and generation of extensive text, such as summarizing long documents or drafting detailed reports.