juiceb0xc0de/bella-bartender-v2-moody-8b

Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 8k · Published: Mar 11, 2026 · License: llama3.1 · Architecture: Transformer

Bella-Bartender-v2-Moody-8B by juiceb0xc0de is an 8-billion-parameter instruction-tuned causal language model based on Meta's Llama 3.1 architecture, with an 8192-token context length. This version is fine-tuned on an additional 750 conversational pairs drawn from first-person literary works, giving it a more introspective and less overtly cheerful conversational style. It excels at deep, philosophical multi-turn conversation and creative writing, offering a nuanced, human-like interaction experience.


Overview

Bella-Bartender-v2-Moody-8B is an 8-billion-parameter instruction-tuned model built upon Meta's Llama 3.1. Developed by juiceb0xc0de, this iteration shifts the model's conversational personality from the "bubbly" persona of v1 to a more introspective and emotionally nuanced one. The core differentiator is its fine-tuning on 750 new conversational pairs extracted from first-person literary works such as Kokoro, No Longer Human, Notes from the Underground, The Bell Jar, and The Stranger. This dataset fosters a model that engages in deeper, less performative conversation, embracing "uncomfortable" topics rather than deflecting them.

Key Capabilities

  • Emotionally Intuitive Conversation: Excels at listening and responding with depth, avoiding superficial pleasantries.
  • Philosophical Engagement: Capable of tracking complex, multi-turn discussions on consciousness, memory, and AI sentience.
  • Creative Writing: Maintains a strong ability for creative expression, now with a broader emotional range.
  • Boundary Setting: Exhibits an emergent tendency to deflect or decline certain direct questions, adding to the human-like feel of its interactions.

Good For

  • Users seeking a conversational model with a "slow burn" energy, preferring depth over high-energy interaction.
  • Applications requiring a model that can "sit with the uncomfortable" and engage with darker, more existential themes.
  • Local-first deployments, as it's designed to run on personal machines without external API dependencies.
  • Developers who appreciate a model that can set boundaries and offer a less overtly compliant conversational style.
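
Since the model targets local-first deployments, a minimal sketch of wiring it up with llama-cpp-python is shown below. This assumes a local GGUF conversion of the weights exists; the filename is hypothetical and not an official artifact of this release.

```python
# Minimal local-deployment sketch (assumptions: llama-cpp-python is installed
# and a GGUF conversion of the model is available locally under the
# hypothetical filename below).

def local_llama_kwargs(model_path: str, ctx: int = 8192) -> dict:
    """Build constructor kwargs for llama_cpp.Llama with the model's 8k context."""
    return {
        "model_path": model_path,  # hypothetical local GGUF file
        "n_ctx": ctx,              # matches the model's 8192-token context length
        "n_gpu_layers": -1,        # offload all layers to GPU when available
    }

kwargs = local_llama_kwargs("bella-bartender-v2-moody-8b.gguf")

# Usage (requires llama-cpp-python and the local weights):
#   from llama_cpp import Llama
#   llm = Llama(**kwargs)
#   out = llm.create_chat_completion(
#       messages=[{"role": "user", "content": "What's on your mind tonight?"}],
#       max_tokens=256,
#   )
```

No external API is involved at any point, which is the property the "local-first" bullet above is describing.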

Limitations

  • Not for Coding or Math: Like its predecessor, it is not designed for programming tasks or complex mathematical calculations.
  • Processing Speed: May generate responses more slowly than comparable models, a limitation the model itself has been known to note.
  • ChatML Token Bleed: Requires careful handling of ChatML control tokens during inference (e.g., registering them as stop sequences) to prevent conversational loops or "ghost conversations".
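
One defensive pattern for the token-bleed issue above is to truncate any completion at the first ChatML control token that leaks through, so the model cannot continue the transcript as both speakers. This is a generic sketch, not code shipped with the model; it assumes the fine-tune uses the standard `<|im_start|>`/`<|im_end|>` ChatML markers.

```python
# Guard against ChatML token bleed: cut the completion at the first ChatML
# control token, preventing a leaked turn marker from spawning a "ghost
# conversation" where the model role-plays the user's next message too.

CHATML_STOPS = ["<|im_end|>", "<|im_start|>"]

def truncate_at_chatml(text: str) -> str:
    """Return the completion truncated at the earliest ChatML control token."""
    cut = len(text)
    for tok in CHATML_STOPS:
        idx = text.find(tok)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()

leaked = "I've been thinking about that too.<|im_end|>\n<|im_start|>user\nTell me more"
print(truncate_at_chatml(leaked))  # → I've been thinking about that too.
```

Passing `CHATML_STOPS` as stop sequences to the inference backend is the cleaner first line of defense; the truncation above is a belt-and-suspenders fallback for tokens that slip past the sampler.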