Model Overview
cgato/Nemo-12b-Humanize-KTO-Experimental-Latest is an experimental 12-billion-parameter language model developed by cgato, with a context length of 32,768 tokens. The repository is where the developer publishes their latest KTO (Kahneman-Tversky Optimization) test runs, so it reflects work in progress rather than a finished release. The model is configured for ChatML formatting and was trained with minimal exposure to synthetic data, with the aim of a distinctive conversational feel.
Key Characteristics
- Experimental Nature: This model is explicitly labeled as experimental and not release-ready, serving as a testing ground for KTO results.
- Unique Conversational Style: The primary goal is to create a model that offers a distinct conversational experience, diverging from the typical outputs of other LLMs.
- Minimal Synthetic Data: Trained with very little synthetic data, which contributes to its unique output characteristics.
- ChatML Formatting: Designed to work with ChatML for structured conversations.
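To make the ChatML point concrete, here is a minimal sketch of how a conversation is typically rendered in that format. The helper name `to_chatml` and the sample messages are hypothetical and used only for illustration. In practice the tokenizer's own chat template, if the repository ships one, should take precedence over hand-built strings.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def to_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render {"role", "content"} messages as a ChatML prompt string.

    Hypothetical helper for illustration; not part of the model repository.
    """
    parts = []
    for message in messages:
        # Each turn: start marker, role, newline, content, end marker, newline.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {"role": "system", "content": "You are a casual, friendly chat partner."},
    {"role": "user", "content": "Hey, how was your weekend?"},
])
print(prompt)
```

The resulting string ends with an open assistant turn, so generation should stop when the model emits `<|im_end|>`.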
Intended Use Cases
- KTO Research and Experimentation: Ideal for researchers and developers interested in exploring the effects of KTO on language model behavior.
- Unique Conversational Agents: Suitable for use cases where a non-standard, more "humanized" conversational style is desired, provided users are willing to experiment with settings.
- Testing and Development: Recommended for those who want to "mess around" with an in-development model and provide feedback on its experimental features.
Important Considerations
This model is highly experimental, and its functionality is not guaranteed. There is a known GGUF tokenization issue involving the <|im_end|> EOS token, which may affect performance for GGUF users. As a starting point, the developer recommends experimenting with inference parameters such as Temp: 0.85, TopK: 40, TopP: 0.9, and RepPen: 1.05, particularly alongside presets like "Simple-1" in interfaces such as SillyTavern.
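As a rough sketch, the suggested sampler values can be kept in one settings dictionary and passed to whichever backend is in use. The key names below follow the Hugging Face `generate` convention, which is an assumption here; other backends may name these options differently. The `trim_at_im_end` helper is hypothetical: it cuts the output at a literal `<|im_end|>` if the marker leaks into generated text. Whether that fully works around the GGUF issue is not confirmed by the source.

```python
# Starting values suggested by the developer; tune from here.
# Key names are an assumption (Hugging Face generate-style naming).
SAMPLING = {
    "temperature": 0.85,
    "top_k": 40,
    "top_p": 0.9,
    "repetition_penalty": 1.05,
}

IM_END = "<|im_end|>"


def trim_at_im_end(text: str, marker: str = IM_END) -> str:
    """Cut generated text at the first literal end-of-turn marker, if present.

    Hypothetical post-processing helper, not part of the model repository.
    """
    head, _, _ = text.partition(marker)
    return head.rstrip()


raw = "Pretty good, thanks for asking!<|im_end|>\n<|im_start|>user\n"
print(trim_at_im_end(raw))
```

If the backend supports stop strings, passing `<|im_end|>` as a stop sequence is an alternative to trimming after the fact.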