shisa-ai/shisa-v1-llama3-8b
shisa-ai/shisa-v1-llama3-8b is an 8 billion parameter instruction-tuned causal language model fine-tuned by shisa-ai from Meta-Llama-3-8B-Instruct. It demonstrates strong performance on Japanese language benchmarks, achieving an average score of 6.59 across ELYZA-tasks-100, JA MT-Bench, Rakuda, and Tengu-Bench, and is optimized for general-purpose Japanese language tasks, making it a competitive option in its size class.
shisa-v1-llama3-8b: A Llama 3-based Japanese-Optimized LLM
shisa-v1-llama3-8b is an 8 billion parameter instruction-tuned model built upon Meta's Llama 3-8B-Instruct architecture. Developed by shisa-ai, this model has undergone fine-tuning to enhance its performance, particularly in Japanese language understanding and generation.
Key Capabilities & Performance
This model demonstrates competitive performance on several Japanese benchmarks, with the shisa-v1-llama3-8b (8e-6) variant achieving an average score of 6.59. Specific benchmark results include:
- ELYZA-tasks-100: 6.67
- JA MT-Bench: 6.95
- Rakuda: 7.05
- Tengu-Bench: 5.68
These scores position it favorably against other 7B-14B parameter models in Japanese contexts, such as lightblue/suzume-llama-3-8B-japanese and augmxnt/shisa-gamma-7b-v1.
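The reported 6.59 average can be checked directly from the four per-benchmark scores listed above:

```python
# Per-benchmark scores for shisa-v1-llama3-8b (8e-6), as reported above
scores = {
    "ELYZA-tasks-100": 6.67,
    "JA MT-Bench": 6.95,
    "Rakuda": 7.05,
    "Tengu-Bench": 5.68,
}

average = sum(scores.values()) / len(scores)
print(f"Average: {average:.2f}")  # → Average: 6.59
```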
Training Details
The model was fine-tuned on the augmxnt/ultra-orca-boros-en-ja-v1 dataset using the Axolotl framework. Training ran for 3 epochs at a learning rate of 8e-6 with a sequence length of 8192 tokens, across 8 GPUs with a total batch size of 64.
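A minimal Axolotl configuration consistent with these reported hyperparameters might look like the sketch below. The micro-batch/accumulation split and the dataset `type` field are assumptions (any combination giving 8 GPUs × micro-batch × accumulation = 64 would match the stated total batch size):

```yaml
# Hypothetical Axolotl config reflecting the reported hyperparameters
base_model: meta-llama/Meta-Llama-3-8B-Instruct
datasets:
  - path: augmxnt/ultra-orca-boros-en-ja-v1
    type: sharegpt              # assumed dataset format
sequence_len: 8192
learning_rate: 8.0e-6
num_epochs: 3
micro_batch_size: 8             # assumed: 8 GPUs x 8 x 1 accumulation = 64 total
gradient_accumulation_steps: 1  # assumed
```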
Intended Use Cases
Given its strong performance on Japanese benchmarks, shisa-v1-llama3-8b is well-suited for applications requiring robust Japanese language processing, including but not limited to:
- General-purpose conversational AI in Japanese
- Text generation and summarization for Japanese content
- Japanese language understanding tasks
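Since the model is fine-tuned from Meta-Llama-3-8B-Instruct, it presumably inherits the standard Llama 3 Instruct chat format. A sketch of assembling such a prompt by hand (in practice, the tokenizer's `apply_chat_template` method would produce this for you):

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a Llama 3 Instruct-style chat prompt. Assumed to apply to
    this fine-tune; normally generated via tokenizer.apply_chat_template."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "あなたは役立つアシスタントです。",  # "You are a helpful assistant."
    "日本の首都はどこですか？",          # "What is the capital of Japan?"
)
print(prompt)
```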
Compute resources for training were provided by Ubitus.