LexGPT-V3: A Testbed for New Training Algorithms
LexGPT-V3 is a 7-billion-parameter language model built on the Mistral v0.1 architecture. Developed by lex-hue, it served as a test run to evaluate the effectiveness of a novel training algorithm and dataset. Although it is an experimental model, LexGPT-V3 performs well, particularly in conversational benchmarks.
Key Capabilities & Performance
- Conversational Prowess: LexGPT-V3 performs well in multi-turn conversations, with an average score of 7.926667, which is competitive with models such as gpt-3.5-turbo and claude-v1.
- Open LLM Leaderboard: The model achieved an average score of 69.49 across various benchmarks on the Open LLM Leaderboard. Specific scores include:
  - AI2 Reasoning Challenge (25-shot): 66.47
  - HellaSwag (10-shot): 85.91
  - MMLU (5-shot): 64.48
  - TruthfulQA (0-shot): 59.98
  - Winogrande (5-shot): 78.53
  - GSM8k (5-shot): 61.56
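As a quick sanity check, the reported leaderboard average can be reproduced from the six scores listed above. The sketch below assumes the average is the unweighted mean of those six benchmarks, which the source does not state explicitly; the variable names are illustrative only.

```python
# Benchmark scores as listed above (Open LLM Leaderboard).
scores = {
    "AI2 Reasoning Challenge (25-shot)": 66.47,
    "HellaSwag (10-shot)": 85.91,
    "MMLU (5-shot)": 64.48,
    "TruthfulQA (0-shot)": 59.98,
    "Winogrande (5-shot)": 78.53,
    "GSM8k (5-shot)": 61.56,
}

# Assumption: the reported average is a simple unweighted mean.
average = sum(scores.values()) / len(scores)
rounded_average = round(average, 2)

print(f"Average over {len(scores)} benchmarks: {rounded_average:.2f}")
```

Under that assumption, the rounded result matches the reported 69.49.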
Intended Use
LexGPT-V3 is primarily a research and development model, showcasing the results of lex-hue's new training methodologies. While it exhibits solid performance, its main purpose was to validate training approaches rather than to be a production-ready solution. Developers interested in exploring models trained with innovative algorithms or seeking a base for further fine-tuning may find this model valuable. For detailed evaluation results, refer to the Open LLM Leaderboard.