nyoshida/vicuna-13b-1.1

Text Generation | Concurrency Cost: 1 | Model Size: 13B | Quant: FP8 | Ctx Length: 4k | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

nyoshida/vicuna-13b-1.1 is a 13 billion parameter open-source chatbot developed by the Vicuna team, fine-tuned from LLaMA on 70K user-shared conversations from ShareGPT. It is an auto-regressive language model based on the transformer architecture, intended primarily for research and hobbyist use in large language models and chatbots. Version 1.1 changes the separator to the EOS token for better compatibility with other libraries and fixes the supervised fine-tuning loss computation to improve model quality.


Vicuna-13b-1.1: An Open-Source Chatbot for Research

nyoshida/vicuna-13b-1.1 is a 13 billion parameter open-source chatbot developed by the Vicuna team, comprising members from UC Berkeley, CMU, Stanford, and UC San Diego. This model is an auto-regressive language model built upon the transformer architecture, fine-tuned specifically for conversational AI.

Key Capabilities & Features

  • Base Model: Fine-tuned from LLaMA.
  • Training Data: Fine-tuned on 70,000 user-shared conversations collected from ShareGPT.com.
  • Research Focus: Primarily designed for research and development in large language models and chatbots.
  • Evaluation Method: Preliminary quality assessment conducted using GPT-4 to judge outputs on a set of 80 diverse questions.

Version 1.1 Updates

  • Tokenization Refinement: The separator has been changed from "###" to the EOS token "</s>", simplifying stop criteria and improving compatibility with other libraries.
  • Improved Model Quality: Supervised fine-tuning loss computation has been fixed, leading to better overall model performance.
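
The separator change above can be illustrated with a small sketch. It shows why using the EOS token "</s>" as the turn separator simplifies stop criteria: a finished assistant turn always ends with the same token the tokenizer already treats as end-of-sequence, so output can be cut at its first occurrence. The role labels ("USER", "ASSISTANT"), the prompt layout, and the helper names `build_prompt` and `truncate_at_separator` are assumptions made for illustration; they are not specified in this card and may differ from the template the model was trained with.

```python
# Minimal sketch of a v1.1-style prompt that uses the EOS token as separator.
# Only the separator values ("###" before, "</s>" in v1.1) come from the card;
# role labels and layout are hypothetical.

SEPARATOR_V1_1 = "</s>"  # EOS token used as the turn separator in v1.1
SEPARATOR_OLD = "###"    # separator used before v1.1


def build_prompt(turns, system="", user_role="USER",
                 assistant_role="ASSISTANT", sep=SEPARATOR_V1_1):
    """Build a prompt from (user_msg, assistant_msg) pairs.

    Pass assistant_msg=None for the final turn that the model should complete.
    """
    prompt = f"{system} " if system else ""
    for user_msg, assistant_msg in turns:
        prompt += f"{user_role}: {user_msg} {assistant_role}:"
        if assistant_msg is not None:
            # A completed assistant turn is closed with the separator.
            prompt += f" {assistant_msg}{sep}"
    return prompt


def truncate_at_separator(generated, sep=SEPARATOR_V1_1):
    """Keep only the text generated before the first separator."""
    return generated.split(sep, 1)[0].strip()


if __name__ == "__main__":
    prompt = build_prompt([("Hi", "Hello!"), ("How are you?", None)])
    print(prompt)
    print(truncate_at_separator(" I'm fine, thanks.</s>USER: next question"))
```

In practice, generation libraries usually stop on the EOS token by themselves, so the explicit truncation step is mainly useful when working with raw decoded text.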

Intended Use Cases

  • Research: Ideal for academic and independent research into large language models and conversational AI.
  • Hobbyist Projects: Suitable for enthusiasts exploring natural language processing, machine learning, and artificial intelligence applications.