eachadea/vicuna-13b-1.1

Text Generation · Open Weights
  • Model Size: 13B
  • Quant: FP8
  • Context Length: 4K
  • Concurrency Cost: 1
  • Published: Apr 13, 2023
  • License: apache-2.0
  • Architecture: Transformer

Vicuna-13B-1.1 is a 13 billion parameter open-source chatbot developed by the Vicuna team (UC Berkeley, CMU, Stanford, UC San Diego). It is an auto-regressive language model fine-tuned on 70K user-shared conversations from ShareGPT. The model is primarily intended for research and hobbyist work on large language models and chatbots.


Vicuna-13B-1.1 Overview

Vicuna-13B-1.1 is a 13 billion parameter open-source chatbot, developed by a collaborative team from UC Berkeley, CMU, Stanford, and UC San Diego. This auto-regressive language model is based on the transformer architecture and was trained between March and April 2023. Its development focused on fine-tuning LLaMA using a dataset of 70,000 user-shared conversations collected from ShareGPT.com.

Key Capabilities & Features

  • Chatbot Fine-tuning: Specialized in generating human-like conversational responses due to its training on diverse user-shared dialogues.
  • Research-Oriented: Primarily designed as a tool for research and experimentation in large language models and chatbot development.
  • Improved Tokenization (v1.1): The v1.1 update refactors tokenization and changes the conversation separator from "###" to the EOS token "</s>", improving compatibility and making the generation stop criterion unambiguous.
  • Supervised Fine-tuning Fixes: Includes fixes for supervised fine-tuning loss computation, aiming for better model quality.
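The separator change above matters when building prompts and stop criteria by hand: completed assistant turns end with the EOS token, so generation stops cleanly at "</s>" rather than on a "###" string match. A minimal sketch in plain Python; the system prompt and USER/ASSISTANT role labels follow the commonly used v1.1 conversation template and should be treated as assumptions, not an official specification:

```python
EOS = "</s>"  # v1.1 separator; v1.0 used "###"

# Assumed v1.1-style system prompt (verify against your serving stack).
SYSTEM = ("A chat between a curious user and an artificial intelligence "
          "assistant. The assistant gives helpful, detailed, and polite "
          "answers to the user's questions.")

def build_prompt(turns):
    """Format (user, assistant) pairs into a Vicuna v1.1-style prompt.

    Completed assistant turns are terminated with the EOS token, which
    the model uses as its stop criterion; the final user turn is left
    open after 'ASSISTANT:' so the model continues from there.
    """
    parts = [SYSTEM]
    for user_msg, assistant_msg in turns:
        parts.append(f" USER: {user_msg} ASSISTANT:")
        if assistant_msg is not None:
            parts.append(f" {assistant_msg}{EOS}")
    return "".join(parts)

prompt = build_prompt([("Hello!", "Hi there!"), ("What is Vicuna?", None)])
```

When decoding, configure the runtime to stop on the EOS token id rather than on a literal "###" substring, which is the practical consequence of the v1.1 change.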

Intended Use Cases

  • LLM Research: Ideal for researchers exploring new techniques in large language models.
  • Chatbot Development: Suitable for hobbyists and developers building and experimenting with conversational AI applications.
  • Educational Purposes: Can be used for learning and understanding transformer-based language models and fine-tuning processes.