tokyotech-llm/Swallow-70b-hf
Text Generation | Concurrency Cost: 4 | Model Size: 70B | Quant: FP8 | Ctx Length: 8k | Published: Nov 25, 2023 | License: llama2 | Architecture: Transformer | Open Weights | Cold
Swallow-70b-hf is a 70-billion-parameter causal language model developed by TokyoTech-LLM, continually pre-trained from the Llama 2 family with a substantial addition of Japanese-language data. Its tokenizer has a vocabulary broadened for Japanese, so Japanese text is represented with fewer tokens and inference is faster. The model performs strongly on Japanese benchmarks covering question answering, summarization, and mathematical reasoning, while remaining competitive in English.
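One way to check the tokenizer claim is to compare how many tokens two tokenizers produce for the same Japanese text. The sketch below is illustrative only: the helper names (`token_count`, `compression_ratio`, `load_hf_tokenize`, `demo`) are hypothetical, the sample sentence is arbitrary, and the baseline repository `meta-llama/Llama-2-70b-hf` is an assumption (it is gated and needs approved access). It relies on the Hugging Face `transformers` `AutoTokenizer.from_pretrained` and `tokenize` calls.

```python
from typing import Callable, List

Tokenize = Callable[[str], List[str]]


def token_count(tokenize: Tokenize, text: str) -> int:
    """Return the number of tokens a tokenizer produces for `text`."""
    return len(tokenize(text))


def compression_ratio(baseline: Tokenize, extended: Tokenize, text: str) -> float:
    """Baseline token count divided by extended-vocabulary token count.

    A value above 1.0 means the extended tokenizer needs fewer tokens.
    """
    extended_count = token_count(extended, text)
    if extended_count == 0:
        raise ValueError("text produced no tokens")
    return token_count(baseline, text) / extended_count


def load_hf_tokenize(model_id: str) -> Tokenize:
    """Load a Hugging Face tokenizer and return its `tokenize` method."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_id).tokenize


def demo() -> None:
    """Downloads tokenizer files; requires network access."""
    text = "東京は日本の首都です。"  # arbitrary sample sentence
    swallow = load_hf_tokenize("tokyotech-llm/Swallow-70b-hf")
    # Assumption: the Llama 2 baseline repo is gated and needs approved access.
    llama2 = load_hf_tokenize("meta-llama/Llama-2-70b-hf")
    print(f"token ratio (Llama 2 / Swallow): {compression_ratio(llama2, swallow, text):.2f}")
```

Calling `demo()` prints the ratio for the sample sentence. Since generation cost grows with the number of tokens produced, a ratio above 1.0 on representative Japanese text would be consistent with the faster inference described above.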