Local-Novel-LLM-project/Vecteus-v1

TEXT GENERATION

Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: May 1, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Vecteus-v1 is a 7 billion parameter large language model developed by Local-Novel-LLM-project, fine-tuned from Mistral-7B-v0.1. This model features an expanded 128k context window and excels at generating high-quality Japanese and English text. It is specifically designed for long-context generation with enhanced memory capabilities, making it suitable for novel writing and extended conversational tasks.


Vecteus-v1: Enhanced Mistral-7B for Long-Context Novel Generation

Vecteus-v1 is a 7 billion parameter large language model developed by Local-Novel-LLM-project, built upon the Mistral-7B-v0.1 architecture. This model distinguishes itself through several key enhancements, particularly its significantly expanded context window and improved multilingual generation capabilities.

Key Capabilities

  • Extended Context Window: Features a 128k context window, a substantial increase over Mistral-7B-v0.1's 8k, enabling much longer and more coherent text generation.
  • Bilingual Generation: Achieves high-quality text generation in both Japanese and English.
  • Enhanced Memory: Designed with improved memory abilities, allowing it to maintain context and coherence over extended generations, crucial for tasks like novel writing.
  • NSFW Generation: Capable of generating NSFW content.
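As a concrete starting point, the capabilities above can be exercised through a standard Hugging Face Transformers workflow. This is a minimal sketch, not an official recipe: the repo id `Local-Novel-LLM-project/Vecteus-v1` is taken from the model name above, and the generation settings (dtype, sampling, token budget) are illustrative assumptions. Because the instruction format is template-free (see Development & Training below), the prompt is plain text rather than a chat template.

```python
# Hedged sketch: generating text with Vecteus-v1 via Hugging Face Transformers.
# Repo id and generation parameters are assumptions, not official settings.

def build_prompt(instruction: str, story_so_far: str = "") -> str:
    """The instruction format is template-free, so the prompt is just plain
    text: any prior story context followed by the instruction."""
    return f"{story_so_far}\n{instruction}".strip()

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and continue the prompt (requires a GPU in practice)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Local-Novel-LLM-project/Vecteus-v1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=True)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

if __name__ == "__main__":
    prompt = build_prompt(
        "Continue the story in Japanese.", "It was a rainy night in Kyoto."
    )
    print(prompt)
```

The same prompt string works for both Japanese and English instructions, since no special role markers or templates are expected.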

Development & Training

Vecteus-v1 was developed with support from the first LocalAI hackathon. Its creation involved a multi-stage process: applying chat vectors across multiple models, simple linear merging, domain and sentence enhancement with LoRA, and context-expansion techniques. The model uses a template-free instruction format.
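The two merging steps mentioned above can be sketched on toy weight tensors. This is an illustration of the general techniques, not the authors' actual pipeline: a "chat vector" is the weight delta between a chat-tuned model and its base, added to a target model, and linear merging is a weighted average of several models' weights.

```python
# Illustrative sketch (not the authors' actual recipe) of chat-vector
# transfer and simple linear merging, using NumPy on toy "weights".
import numpy as np

def chat_vector(target, chat, base):
    """Transfer chat behavior to `target`: target + (chat - base)."""
    return target + (chat - base)

def linear_merge(weights, coeffs):
    """Weighted average of several models' weights (coeffs should sum to 1)."""
    return sum(c * w for c, w in zip(coeffs, weights))

# Toy scalar example standing in for full weight tensors:
base, chat, target = np.array(1.0), np.array(3.0), np.array(2.0)
merged = chat_vector(target, chat, base)              # 2 + (3 - 1) = 4.0
blended = linear_merge([merged, target], [0.5, 0.5])  # 0.5*4 + 0.5*2 = 3.0
```

In practice these operations are applied tensor-by-tensor across entire state dicts; the arithmetic is the same.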

Considerations

Users should be aware that the training data may introduce biases, and memory usage can be substantial during long-context inference. For best performance, the developers recommend running inference with llama.cpp rather than the Transformers library where possible.
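To make the memory warning concrete, the KV cache alone grows linearly with context length. The estimate below is a rough sketch: the architecture numbers are those of Mistral-7B (32 layers, 8 grouped-query KV heads, head dimension 128), assumed to carry over to this fine-tune, with keys and values stored in fp16 (2 bytes each).

```python
# Rough KV-cache size estimate for long-context inference.
# Architecture numbers assume the Mistral-7B base (32 layers, 8 KV heads,
# head dim 128); dtype_bytes=2 assumes fp16 keys/values.
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=8, head_dim=128,
                   dtype_bytes=2):
    # Factor of 2 covers the separate key and value tensors per layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * dtype_bytes

GIB = 1024 ** 3
print(f"8k ctx:   {kv_cache_bytes(8 * 1024) / GIB:.1f} GiB")    # 1.0 GiB
print(f"128k ctx: {kv_cache_bytes(128 * 1024) / GIB:.1f} GiB")  # 16.0 GiB
```

At the full 128k context this cache rivals the size of the model weights themselves, which is one reason quantized llama.cpp inference is attractive for long generations.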