eachadea/vicuna-7b-1.1

Text Generation · Open Weights · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4K · Published: Apr 13, 2023 · License: apache-2.0 · Architecture: Transformer

eachadea/vicuna-7b-1.1 is a 7 billion parameter open-source chatbot fine-tuned by the Vicuna team on user-shared conversations from ShareGPT. Based on the LLaMA architecture, this model is designed for research into large language models and chatbots, offering improved tokenization and loss computation compared to its predecessor. It provides a strong foundation for natural language processing and AI research.


Vicuna-7b-1.1 Overview

Vicuna-7b-1.1 is a 7 billion parameter open-source chatbot developed by a collaborative team from UC Berkeley, CMU, Stanford, and UC San Diego. It is built upon the LLaMA architecture and was fine-tuned using 70,000 user-shared conversations collected from ShareGPT.com.

Key Updates and Features

  • Improved Tokenization: Version 1.1 refactors the tokenization and separator, changing the separator from "###" to the EOS token "</s>". This change simplifies the determination of generation stop criteria and improves compatibility with other libraries.
  • Enhanced Training: The model incorporates fixes to the supervised fine-tuning loss computation, contributing to better overall model quality.
  • Research-Oriented: Primarily intended for research purposes in large language models and chatbots, making it suitable for researchers and hobbyists in NLP, machine learning, and AI.
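The separator change above can be illustrated with a short sketch. This is a minimal reconstruction of the v1.1 conversation format as popularized by FastChat's "vicuna_v1.1" template; the helper names `build_prompt` and `trim_at_eos` are hypothetical, and the exact system prompt should be verified against your FastChat version.

```python
# Sketch of the Vicuna v1.1 conversation format (an assumption based on the
# FastChat "vicuna_v1.1" template; verify against the version you use).
EOS = "</s>"  # v1.1 uses the EOS token as separator instead of "###"

SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the "
    "user's questions."
)

def build_prompt(turns):
    """Assemble a prompt from (user, assistant) turn pairs.

    `assistant` may be None for the final turn the model should complete.
    """
    parts = [SYSTEM]
    for user, assistant in turns:
        parts.append(f" USER: {user} ASSISTANT:")
        if assistant is not None:
            # Each completed assistant reply ends with the EOS token, which
            # is what v1.1 generation stop criteria key on.
            parts.append(f" {assistant}{EOS}")
    return "".join(parts)

def trim_at_eos(generated):
    """Cut a raw completion at the first EOS token, if present."""
    return generated.split(EOS, 1)[0]
```

For example, `build_prompt([("Hello!", None)])` yields a prompt ending in `ASSISTANT:`, ready for the model to complete; because replies terminate with `</s>` rather than `###`, a stop criterion only needs to watch for the tokenizer's EOS token.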

Intended Use Cases

  • LLM Research: Ideal for exploring and experimenting with chatbot capabilities and language model behaviors.
  • Chatbot Development: Provides a solid base for building and testing conversational AI applications.
  • Academic Studies: Useful for academic investigations into fine-tuning techniques and conversational data utilization.