maldv/praxis-bookwriter-llama3.1-8b-sft

Warm
Public
8B
FP8
32768
May 21, 2025
License: cc-by-nc-4.0
Hugging Face
Overview

Overview

maldv/praxis-bookwriter-llama3.1-8b-sft is an 8 billion parameter model, fine-tuned by Praxis Maldevide from Meta-Llama-3.1-8B, specifically designed for long-form creative writing. It addresses common issues with instruction following in previous iterations by integrating story chapter text information directly into the generation process. The model was trained using rsLoRA on a dataset of approximately 140 million tokens, with strides of 16,384 tokens across books, generating summaries to guide the initial user turn.

Key Capabilities

  • Instruction-Following for Creative Writing: Significantly improved ability to follow detailed instructions for generating narrative content.
  • Long-Form Coherence: Utilizes a 24576 token context length, enabling it to maintain narrative consistency over extended story chapters.
  • Contextual Generation: Employs a unique prompting strategy where an initial user turn provides a detailed setting summary (500-1500 tokens) and instructions, followed by alternating assistant and user turns for chapter headers or paragraphs.
  • Llama 3.1 Architecture: Built upon the robust Meta-Llama-3.1-8B base model, enhanced with rsLoRA for specialized performance.

Good For

  • Creative Writers: Ideal for authors and writers seeking an AI assistant to generate story chapters, expand narratives, or develop plot points based on specific instructions.
  • Long-Form Content Generation: Excels in scenarios requiring the creation of coherent, extended textual content, such as book chapters or detailed story segments.
  • Instruction-Driven Storytelling: Particularly effective when users can provide comprehensive initial summaries and iterative guidance for narrative development.