# Lemur-7B: A LLaMA-based Chatbot Model
Lemur-7B is a 7-billion-parameter chatbot model developed by tianyang, built on the LLaMA architecture. It was fine-tuned with the LoRA method on several publicly available conversational datasets (Alpaca-GPT4, Baize, ShareGPT, and Vicuna-Dummy-Conversation) to improve its conversational ability.
## Key Characteristics
- Base Model: LLaMA architecture.
- Fine-tuning Method: LoRA (Low-Rank Adaptation).
- Training Data: Utilizes a diverse set of conversational datasets including Alpaca-GPT4, Baize, ShareGPT, and Vicuna-Dummy-Conversation.
- Intended Use: Specifically designed for chatbot applications, aiming to provide helpful, detailed, and polite responses.
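LoRA fine-tuning freezes the pretrained weights and trains only a low-rank update, which sharply reduces the number of trainable parameters. The toy NumPy sketch below illustrates the idea; the dimensions and rank are illustrative choices, not LLaMA's actual sizes or the settings used for Lemur-7B:

```python
import numpy as np

d, k, r = 4096, 4096, 8           # illustrative hidden dims and LoRA rank

W = np.random.randn(d, k)         # frozen pretrained weight matrix
A = np.random.randn(r, k) * 0.01  # trainable down-projection (rank r)
B = np.zeros((d, r))              # trainable up-projection, zero-initialized

# Effective weight during fine-tuning: W + B @ A.
# Only A and B receive gradient updates; W stays frozen.
W_adapted = W + B @ A

full = d * k          # parameters touched by full fine-tuning of W
lora = r * (d + k)    # parameters LoRA actually trains for this layer
print(f"trainable params: {lora:,} vs full fine-tune: {full:,}")
```

Because `B` starts at zero, the adapted weight equals the original weight before training begins, so fine-tuning starts from the unmodified base model.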
## Important Considerations
- Project-Specific Design: Lemur-7B was created solely as a final project for CSE 256 (Statistical Natural Language Processing) at UCSD.
- Usage Restrictions: The model is not intended for commercial use or widespread deployment. Testing and exploration are welcome, but please limit use to academic and non-commercial purposes.
- Data Source Warning: The ShareGPT data used in training may raise copyright concerns that are currently disputed. Users should exercise caution on this point.
## Prompt Format
The model expects a specific prompt format for interaction:
```python
MSG = "Hi, how are you?"
prompt = f"""A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.
[Human]: {MSG}
[AI]:"""
```
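For repeated use, the template above can be wrapped in a small helper. This is a minimal sketch; the `build_prompt` function name is my own and not part of the model's release:

```python
# System preamble taken from the prompt template above.
SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's "
    "questions."
)

def build_prompt(msg: str) -> str:
    """Wrap a user message in the [Human]/[AI] template the model expects."""
    return f"{SYSTEM}\n[Human]: {msg}\n[AI]:"

print(build_prompt("Hi, how are you?"))
```

The generated text should then be read from the model's continuation after the trailing `[AI]:` marker.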