eachadea/vicuna-13b-1.1
Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Published: Apr 13, 2023 · License: apache-2.0 · Architecture: Transformer · Open Weights
Vicuna-13B-1.1 is a 13 billion parameter open-source chatbot developed by the Vicuna team (UC Berkeley, CMU, Stanford, UC San Diego). It is an auto-regressive language model fine-tuned on 70K user-shared conversations from ShareGPT. This model is primarily intended for research and hobbyist use in large language models and chatbots.
Vicuna-13B-1.1 Overview
Vicuna-13B-1.1 is a 13 billion parameter open-source chatbot, developed by a collaborative team from UC Berkeley, CMU, Stanford, and UC San Diego. This auto-regressive language model is based on the transformer architecture and was trained between March and April 2023. Its development focused on fine-tuning LLaMA using a dataset of 70,000 user-shared conversations collected from ShareGPT.com.
Key Capabilities & Features
- Chatbot Fine-tuning: Specialized in generating human-like conversational responses due to its training on diverse user-shared dialogues.
- Research-Oriented: Primarily designed as a tool for research and experimentation in large language models and chatbot development.
- Improved Tokenization (v1.1): The v1.1 update refactors tokenization and changes the conversation separator from "###" to the EOS token "</s>", improving compatibility and giving generation a well-defined stop criterion.
- Supervised Fine-tuning Fixes: Includes fixes for supervised fine-tuning loss computation, aiming for better model quality.
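The separator change above affects how multi-turn prompts must be assembled for v1.1. A minimal sketch of the template is below; the system prompt and the USER/ASSISTANT role labels follow the FastChat convention for Vicuna v1.1 and are assumptions, not details stated on this card:

```python
# Sketch of a Vicuna-v1.1-style prompt builder: completed assistant turns
# are terminated with the EOS token "</s>" (replacing the old "###"
# separator), which also serves as the stop criterion during generation.
# The system prompt and role labels are assumed from FastChat conventions.
EOS = "</s>"
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the "
    "user's questions."
)

def build_prompt(turns):
    """Build a prompt from (user_msg, assistant_msg_or_None) pairs."""
    parts = [SYSTEM]
    for user_msg, assistant_msg in turns:
        parts.append(f" USER: {user_msg} ASSISTANT:")
        if assistant_msg is not None:
            # Finished assistant replies end with EOS so the model learns
            # (and signals) where a turn stops.
            parts.append(f" {assistant_msg}{EOS}")
    return "".join(parts)

prompt = build_prompt([
    ("What is Vicuna?", "An open-source chatbot fine-tuned from LLaMA."),
    ("Who trained it?", None),  # open turn for the model to complete
])
print(prompt)
```

Generation would then be run with `</s>` as the stop token, so decoding halts cleanly at the end of the assistant's reply.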
Intended Use Cases
- LLM Research: Ideal for researchers exploring new techniques in large language models.
- Chatbot Development: Suitable for hobbyists and developers building and experimenting with conversational AI applications.
- Educational Purposes: Can be used for learning and understanding transformer-based language models and fine-tuning processes.