tdrussell/Llama-3-70B-Instruct-Storywriter
Available on Hugging Face
Text Generation | Concurrency Cost: 4 | Model Size: 70B | Quant: FP8 | Ctx Length: 8k | Published: Apr 30, 2024 | Architecture: Transformer

tdrussell/Llama-3-70B-Instruct-Storywriter is a 70-billion-parameter Llama 3 Instruct model fine-tuned on a dataset of fiction books. The fine-tune is optimized for creative writing tasks, shifting the model's style markedly toward more imaginative output. With an 8192-token context length, it is well suited to generating narrative content and can be used effectively in both instruct chat and raw completion modes.


Llama 3 70B Instruct Storywriter Overview

tdrussell/Llama-3-70B-Instruct-Storywriter is a specialized variant of the Llama 3 70B Instruct model, developed by tdrussell. This model has undergone further fine-tuning using a dataset comprising approximately 800 fiction books, totaling 570 MB of raw text. The primary goal of this fine-tuning was to enhance the model's creative writing capabilities and narrative style.

Key Capabilities

  • Enhanced Creative Writing: The fine-tuning process has significantly altered the model's writing style, making it more creative and suitable for generating fictional narratives.
  • Instruction-tuned Compatibility: Retains compatibility with standard Llama 3 Instruct chat formatting, allowing for versatile interaction.
  • Flexible Usage: Can be effectively utilized in both instruction-based chat scenarios and raw completion modes.
  • 70B Parameter Base: Built upon the robust Llama 3 70B Instruct architecture, providing a strong foundation for language understanding and generation.
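Because the model retains the standard Llama 3 Instruct chat format, prompts can be assembled with Meta's published special tokens. A minimal sketch (the helper function below is illustrative, not part of any official library):

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama 3 Instruct chat format."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        # Leave the assistant header open so the model generates the reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a creative fiction writer.",
    "Write the opening paragraph of a gothic mystery.",
)
```

For raw completion mode, you would instead send unformatted story text and let the model continue it directly.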

Training Details

The model was trained using QLoRA (Rank 64) at an 8192 sequence length, leveraging 4x 4090 GPUs with the qlora-pipe framework. While the fine-tuning aimed to boost creativity, the developer notes a potential slight decrease in overall general intelligence compared to the base Llama 3 Instruct model. Attempts to apply similar fine-tuning to the Llama 3 8B Instruct model were unsuccessful, suggesting that the 70B parameter count is more suitable for this specific type of creative specialization.
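As a rough illustration of why a rank-64 QLoRA run is tractable on four consumer GPUs: a LoRA adapter on a weight matrix of shape (d_out, d_in) adds only rank × (d_in + d_out) trainable parameters while the quantized base weights stay frozen. The back-of-the-envelope calculation below assumes Llama 3 70B's hidden size of 8192; the exact set of matrices adapted in the qlora-pipe run is not stated here.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    # LoRA factorizes the weight update as B @ A, where
    # A has shape (rank, d_in) and B has shape (d_out, rank).
    return rank * (d_in + d_out)

# One square 8192x8192 projection at rank 64:
per_matrix = lora_param_count(8192, 8192, 64)
print(per_matrix)  # 1048576 trainable params, vs ~67M in the frozen matrix
```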

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Each configuration sets values for the following samplers:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
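These samplers map directly onto fields of an OpenAI-compatible chat completion request. A hypothetical payload is sketched below; the model name is real, but every numeric value is a placeholder, not one of the user configurations referenced above, and support for the non-standard fields (top_k, repetition_penalty, min_p) depends on the serving API.

```python
# Hypothetical sampler settings for a chat completion request.
# All numeric values are placeholders chosen for illustration.
payload = {
    "model": "tdrussell/Llama-3-70B-Instruct-Storywriter",
    "messages": [{"role": "user", "content": "Continue the story."}],
    "temperature": 0.9,
    "top_p": 0.95,
    "top_k": 40,
    "frequency_penalty": 0.1,
    "presence_penalty": 0.1,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}
```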