tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3
Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Dec 18, 2024 · License: llama3.1 · Architecture: Transformer
The Llama-3.1-Swallow-8B-Instruct-v0.3 model by tokyotech-llm is an 8 billion parameter instruction-tuned large language model, continually pre-trained from Meta's Llama 3.1. It significantly enhances Japanese language capabilities while retaining strong English performance; continual pre-training used approximately 200 billion tokens drawn from Japanese web corpora, Wikipedia, and technical content. The model excels at multi-turn Japanese dialogue, achieving state-of-the-art performance on Japanese MT-Bench among open-source LLMs of comparable size.
Popular Sampler Settings
Top 3 parameter combinations used by Featherless users for this model.

- temperature: –
- top_p: –
- top_k: –
- frequency_penalty: –
- presence_penalty: –
- repetition_penalty: –
- min_p: –
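Since Featherless serves models through an OpenAI-compatible chat completions API, a request that applies custom sampler settings might look like the sketch below. The endpoint URL and every parameter value shown are illustrative assumptions, not values taken from this page; substitute your own key and the config you actually want.

```python
# Sketch: assembling a chat-completion payload with explicit sampler settings
# for Llama-3.1-Swallow-8B-Instruct-v0.3. Endpoint and values are assumptions.
import json

API_URL = "https://api.featherless.ai/v1/chat/completions"  # assumed endpoint

def build_request(prompt: str) -> dict:
    """Build a payload carrying the sampler parameters listed on this page.

    The values below are placeholders, not the actual top-3 Featherless
    configs; repetition_penalty and min_p are extended sampler parameters
    beyond the base OpenAI schema.
    """
    return {
        "model": "tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "top_p": 0.9,
        "top_k": 40,
        "frequency_penalty": 0.0,
        "presence_penalty": 0.0,
        "repetition_penalty": 1.05,
        "min_p": 0.05,
    }

# A Japanese prompt plays to the model's strength in Japanese dialogue.
payload = build_request("日本の首都はどこですか？")
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Sending this payload with any HTTP client (plus an `Authorization: Bearer <key>` header) would then return a standard chat-completion response.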