N8Programs/Coxcomb

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Apr 17, 2024 · License: apache-2.0 · Architecture: Transformer

N8Programs/Coxcomb is a 7 billion parameter Mistral-based language model, fine-tuned on GPT-4 outputs for creative writing tasks. Developed by N8Programs, it excels in generating diverse story content and achieves a creative writing benchmark score of 72.37, outperforming larger models. This model is specifically designed for single-shot creative writing interactions and is optimized for local, offline environments.

Coxcomb: A Creative Writing Specialist

Coxcomb is a 7 billion parameter language model developed by N8Programs, built upon the senseable/WestLake-7B-v2 base. It has been meticulously fine-tuned on GPT-4 outputs from a diverse range of prompts, specifically targeting creative writing applications. While not intended to compete with GPT-4 in overall quality, Coxcomb is optimized for performance in local, offline environments.

Key Capabilities & Performance

  • Creative Writing Excellence: Coxcomb is highly specialized for generating creative text, such as stories, and consistently ranks above many other models on creative writing benchmarks.
  • Benchmark Score: Achieves a score of 72.37 on creative writing benchmarks, surpassing models like Goliath-120B, Yi Chat, and Mistral-Large.
  • Single-Shot Interactions: Designed for direct, single-query story generation rather than multi-turn conversations, roleplay, or follow-up questions.
  • Content Generation: Capable of generating NSFW content (sexual or violent) when prompted, as it has not been trained with refusal behaviors.

Training Details

  • Base Model: senseable/WestLake-7B-v2 (Mistral architecture).
  • Fine-tuning: Trained with a 40M parameter LoRA on the N8Programs/CreativeGPT dataset for 3 epochs, with intentional slight overfitting for improved benchmark results.
  • Efficiency: The model was trained on a single M3 Max in approximately 12 hours.
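The stated 40M LoRA parameter count is consistent with a modest rank applied across a Mistral-7B-shaped model. As a rough sanity check (the model card does not state the rank or target modules, so the rank-16, all-linear-projections configuration below is an assumption):

```python
# Rough estimate of LoRA parameter count for a Mistral-7B-shaped model.
# Assumptions (not stated in the model card): rank 16, adapters on all
# seven linear projections (q, k, v, o, gate, up, down) in every layer.
hidden = 4096   # Mistral-7B hidden size
kv_dim = 1024   # 8 KV heads x 128 head dim (grouped-query attention)
inter = 14336   # feed-forward intermediate size
layers = 32
rank = 16

# A LoRA adapter on a (d_in x d_out) matrix adds rank * (d_in + d_out) params.
per_layer = (
    rank * (hidden + hidden)    # q_proj
    + rank * (hidden + kv_dim)  # k_proj
    + rank * (hidden + kv_dim)  # v_proj
    + rank * (hidden + hidden)  # o_proj
    + rank * (hidden + inter)   # gate_proj
    + rank * (hidden + inter)   # up_proj
    + rank * (inter + hidden)   # down_proj
)
total = per_layer * layers
print(f"{total:,} trainable LoRA parameters")  # ~41.9M, roughly the stated 40M
```

Under these assumptions the adapter comes to about 41.9M trainable parameters, in line with the "40M parameter LoRA" figure above; a different rank or a smaller set of target modules would shift the number accordingly.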

Considerations

  • Limitations: Tends to produce stories with happy, conventional endings, a common characteristic among many LLMs.
  • Deployment: GGUF versions are available for local deployment (Coxcomb-GGUF), and the model is expected to work with the Hugging Face transformers library.
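Since the model targets single-shot story generation rather than multi-turn chat, local use boils down to sending one instruction and reading one completion. A minimal sketch of that pattern with transformers follows; the prompt wording is an illustrative plain-string assumption, not a documented template for this model, and the loading/generation calls are commented out so the sketch runs without downloading weights:

```python
# Single-shot usage sketch for Coxcomb. The prompt format below is an
# assumption for illustration; check the model repository for the
# template actually used during fine-tuning.

def build_single_shot_prompt(instruction: str) -> str:
    """Coxcomb is designed for one-query story generation,
    so the prompt carries a single instruction and no chat history."""
    return (
        "Write a story based on the following prompt.\n\n"
        f"Prompt: {instruction}\n\nStory:"
    )

prompt = build_single_shot_prompt("A lighthouse keeper finds a message in a bottle.")

# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("N8Programs/Coxcomb")
# model = AutoModelForCausalLM.from_pretrained("N8Programs/Coxcomb")
# out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=512)
# print(tok.decode(out[0], skip_special_tokens=True))
print(prompt)
```

The same one-instruction prompt can be passed to llama.cpp or a similar runtime when using the GGUF builds for fully offline inference.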