TheBloke/Minotaur-13B-fixed-SuperHOT-8K-fp16
TEXT GENERATION · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · License: other · Architecture: Transformer
TheBloke/Minotaur-13B-fixed-SuperHOT-8K-fp16 is a 13 billion parameter language model, created by TheBloke, merging OpenAccess AI Collective's Minotaur 13B Fixed with Kaio Ken's SuperHOT 8K. The model targets extended-context applications, supporting an 8K context length, and is suitable for general instruction-following tasks. It is distributed as fp16 weights in PyTorch format, intended for GPU inference and as a base for further conversions.
Model Overview
This model, TheBloke/Minotaur-13B-fixed-SuperHOT-8K-fp16, is a 13 billion parameter instruct fine-tuned model. It is a merge of OpenAccess AI Collective's Minotaur 13B Fixed and Kaio Ken's SuperHOT 8K.
Key Capabilities & Features
- Extended Context Window: Achieves an 8K context length during inference by leveraging the SuperHOT 8K merge and `trust_remote_code=True`.
- Instruction Following: Fine-tuned on a variety of completely open datasets, including WizardLM, Alpaca-CoT, GPTeacher-General-Instruct, and several academic datasets for math, science, and summarization.
- Reproducible Training: Minotaur 13B Fixed was trained exclusively on openly available datasets, ensuring reproducibility.
- Fixed Training Bug: The base Minotaur 13B Fixed model corrected an issue where initial training runs dropped datasets related to prose generation, classification, and coding.
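Since the 8K context window depends on passing `trust_remote_code=True` at load time, a minimal sketch of the loading configuration may help. This is an illustrative helper, not code from the model repository; the kwargs mirror the common `transformers` `from_pretrained` interface, and the string `"float16"` dtype form is an assumption about the library version in use.

```python
MODEL_ID = "TheBloke/Minotaur-13B-fixed-SuperHOT-8K-fp16"

def from_pretrained_kwargs():
    """Keyword arguments one would pass to transformers'
    AutoModelForCausalLM.from_pretrained for this model.

    trust_remote_code=True lets the SuperHOT patch shipped with the
    repo override the positional embeddings, extending the usable
    context window to 8K tokens; without it the model falls back to
    the base LLaMA context length.
    """
    return {
        "trust_remote_code": True,  # required for the 8K context patch
        "torch_dtype": "float16",   # weights are stored in fp16
    }
```

In practice the call would look like `AutoModelForCausalLM.from_pretrained(MODEL_ID, **from_pretrained_kwargs())`, with the tokenizer loaded the same way.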
Use Cases & Strengths
- General Purpose Chat: Designed for chat-style prompts using `USER:` and `ASSISTANT:` formats.
- Reasoning Tasks: Includes training data from mathematical and scientific domains, as well as datasets like GSM8K and ARC-Challenge, indicating capabilities in reasoning and problem-solving.
- Creative Text Generation: Examples show ability to generate short stories and haikus, though quality may vary.
Limitations
- The model has not undergone alignment with human preferences (e.g., RLHF), so it may produce problematic outputs when prompted to do so.
- Inherits limitations from its base model, LLaMA-13B.