tokyotech-llm/Swallow-70b-instruct-hf
Task: Text generation
Model size: 70B
Quantization: FP8
Context length: 8k
Published: Dec 11, 2023
License: llama2
Architecture: Transformer (open weights)

Swallow-70b-instruct-hf, developed by TokyoTech-LLM, is a 70-billion-parameter instruction-tuned causal language model based on Llama 2. It was continually pre-trained from Llama 2 on a large amount of additional Japanese data, with a vocabulary extended for more efficient Japanese tokenization and faster inference. The model performs strongly on Japanese benchmarks spanning question answering, summarization, and mathematical reasoning, while retaining competitive English capabilities.
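As a sketch of how the instruction-tuned model is typically prompted, the snippet below builds an Alpaca-style Japanese prompt. The exact template string is an assumption based on common Swallow usage, not taken from this page; consult the official model card for the authoritative format.

```python
# Hypothetical prompt builder for Swallow-70b-instruct-hf.
# The Japanese Alpaca-style template below is an assumption; verify it
# against the official TokyoTech-LLM model card before use.

PROMPT_TEMPLATE = (
    "以下に、あるタスクを説明する指示があります。"  # system preamble
    "リクエストを適切に完了するための回答を記述してください。\n\n"
    "### 指示:\n{instruction}\n\n"  # user instruction slot
    "### 応答:\n"  # model completes after this marker
)

def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the assumed instruction-tuning template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)

if __name__ == "__main__":
    print(build_prompt("東京工業大学について教えてください。"))
```

The resulting string would then be passed to the model (e.g. via `transformers`' `AutoModelForCausalLM.generate`) for text generation.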
