Sentdex/WSB-GPT-13B

Text Generation | Concurrency Cost: 1 | Model Size: 13B | Quant: FP8 | Ctx Length: 4k | Published: Aug 31, 2023 | License: apache-2.0 | Architecture: Transformer | Open Weights

Sentdex/WSB-GPT-13B is a 13-billion-parameter Llama 2 Chat model fine-tuned by Sentdex with QLoRA on /r/wallstreetbets comments from 2017-2018, which gives it a distinct, character-filled conversational style. It has a context length of 4096 tokens and is primarily intended as a fun chatbot and as a way to explore QLoRA and community-specific language.


Overview

Sentdex/WSB-GPT-13B is a 13-billion-parameter Llama 2 Chat model, fine-tuned by Sentdex using QLoRA. The fine-tuning dataset consists of comments and their replies from the /r/wallstreetbets subreddit, specifically from the 2017-2018 era. The goal was to explore QLoRA and to give the model a unique, character-driven conversational style.
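To give a rough sense of why QLoRA makes a 13B fine-tune practical, the sketch below does the back-of-the-envelope arithmetic: the frozen base weights are stored in 4 bits, and only small low-rank adapter matrices are trained. The numbers are assumptions, not details from this model card: the hidden size (5120) and layer count (40) come from the public Llama 2 13B configuration, while the adapter rank (16) and the choice of adapting only the query and value projections are hypothetical, since the card does not state the settings actually used.

```python
# Back-of-the-envelope QLoRA arithmetic for a 13B model.
# All hyperparameters below are illustrative assumptions.

BASE_PARAMS = 13e9   # nominal parameter count ("13B")
HIDDEN = 5120        # Llama 2 13B hidden size (public config, not from this card)
LAYERS = 40          # Llama 2 13B transformer layers (public config)
RANK = 16            # hypothetical LoRA rank
ADAPTED_PER_LAYER = 2  # hypothetical: query and value projections only


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """A LoRA adapter adds two matrices, (d_in x rank) and (rank x d_out)."""
    return rank * (d_in + d_out)


# Trainable adapter parameters across all layers.
trainable = LAYERS * ADAPTED_PER_LAYER * lora_param_count(HIDDEN, HIDDEN, RANK)

# Approximate weight storage, ignoring quantization constants and activations.
fp16_gb = BASE_PARAMS * 2 / 1e9    # 2 bytes per parameter
nf4_gb = BASE_PARAMS * 0.5 / 1e9   # 4 bits = 0.5 bytes per parameter

print(f"trainable adapter params: {trainable:,}")
print(f"share of base model:      {trainable / BASE_PARAMS:.4%}")
print(f"base weights in FP16:     ~{fp16_gb:.1f} GB")
print(f"base weights in 4-bit:    ~{nf4_gb:.1f} GB")
```

Under these assumptions the adapters amount to roughly a tenth of a percent of the base model, and the quantized base weights take about a quarter of the FP16 footprint. Real memory use is higher once optimizer state, activations, and quantization overhead are included.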

Key Capabilities

  • Distinct Conversational Style: Exhibits language patterns and humor characteristic of the /r/wallstreetbets community from its training period.
  • QLoRA Exploration: Serves as a practical example and learning tool for QLoRA fine-tuning techniques.
  • Chatbot Functionality: Designed to engage in chat-based interactions, reflecting its specialized training.
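For chat use, the conversation history has to fit inside the model's 4096-token context window. The following is a minimal sketch of one way to do that, keeping the most recent turns and leaving room for the reply. The helper names (`rough_token_count`, `trim_history`) and the 512-token reply reserve are hypothetical, and the whitespace-based counter is only a crude stand-in for the model's real tokenizer, which would give different counts.

```python
from typing import Callable, List

CONTEXT_LENGTH = 4096  # context length stated in the model card


def rough_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_history(
    turns: List[str],
    reserve_for_reply: int = 512,  # hypothetical budget for the generated reply
    count_tokens: Callable[[str], int] = rough_token_count,
    context_length: int = CONTEXT_LENGTH,
) -> List[str]:
    """Keep the most recent turns that fit in the remaining token budget."""
    budget = context_length - reserve_for_reply
    kept: List[str] = []
    used = 0
    # Walk backwards so the newest turns are kept first.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["first message here", "second message", "latest"]
print(trim_history(history, reserve_for_reply=0, context_length=3))
```

In practice, passing the model's own tokenizer as `count_tokens` would make the budget accurate; the structure of the loop stays the same.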

Intended Use and Limitations

This model is intended primarily for fun chatbot interactions and as a learning resource for QLoRA. It may use language that some users find abrasive or offensive, reflecting the nature of its training data. It is not intended for general-purpose applications or for audiences sensitive to strong language. The English-centric fine-tuning may also have affected the multilingual capabilities inherited from the Llama 2 base.