Model Overview
Stevross/Astrid-7B-Assistant-CPU is a 7-billion-parameter language model built on the Mistral architecture. It is configured for assistant-style text generation, making it suitable for conversational AI and general question-answering tasks. The architecture comprises 32 MistralDecoderLayer blocks, each with MistralAttention and MistralMLP components, and uses a vocabulary of 32,002 tokens.
Key Capabilities
- Assistant-style Text Generation: Optimized for producing coherent and relevant responses in a conversational format.
- Mistral Architecture: Leverages the efficient and performant Mistral base model for its underlying structure.
- CPU Deployment: Designed to support quantization (8-bit or 4-bit) and sharding, enabling efficient inference on CPU-only machines or on machines with limited GPU resources.
- Hugging Face Transformers Integration: Fully compatible with the transformers library for easy setup and use, including pipeline generation and direct model/tokenizer loading.
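As a rough illustration of the pipeline route mentioned above, the sketch below wraps the standard transformers text-generation pipeline in a small helper. The helper name load_generator is hypothetical, and running on CPU via device=-1 is an assumption about a typical setup rather than the model author's documented configuration; the model may need additional loading options not shown here.

```python
MODEL_ID = "Stevross/Astrid-7B-Assistant-CPU"


def load_generator(model_id: str = MODEL_ID):
    """Return a text-generation pipeline for the given model (hypothetical helper).

    The import is deferred so this module can be loaded without transformers
    installed; calling the function downloads the model weights.
    """
    from transformers import pipeline

    # device=-1 asks the pipeline to run on CPU (assumed setup).
    return pipeline("text-generation", model=model_id, device=-1)
```

Calling load_generator() downloads several gigabytes of weights on first use, so it is best invoked once and the returned pipeline reused.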
Usage Considerations
- Prompt Formatting: Requires the prompt format <|prompt|>...<|im_end|><|answer|>, consistent with its training, to ensure optimal performance.
- Customizable Generation: Supports generation parameters such as min_new_tokens, max_new_tokens, temperature, repetition_penalty, and num_beams for fine-grained control over output.
- Disclaimer: Users should be aware of potential biases and limitations inherent in large language models trained on internet data. Responsible and ethical use is encouraged.
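The prompt format and generation parameters above can be sketched as follows. The function name format_prompt and the dictionary GENERATION_KWARGS are hypothetical, and every numeric value is an illustrative assumption, not a recommended setting from the model card. The do_sample flag is also an addition here: in transformers, temperature only takes effect when sampling is enabled.

```python
def format_prompt(user_message: str) -> str:
    """Wrap a user message in the <|prompt|>...<|im_end|><|answer|> template."""
    return f"<|prompt|>{user_message}<|im_end|><|answer|>"


# Illustrative values only (assumptions, not tuned settings).
GENERATION_KWARGS = {
    "min_new_tokens": 2,
    "max_new_tokens": 256,
    "do_sample": True,          # assumed; needed for temperature to apply
    "temperature": 0.3,
    "repetition_penalty": 1.2,
    "num_beams": 1,
}

prompt = format_prompt("Why is drinking water so healthy?")
print(prompt)
```

With a text-generation pipeline object already loaded (here called generator, an assumed name), these would typically be passed as generator(prompt, **GENERATION_KWARGS).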