tokyotech-llm/Llama-3.3-Swallow-70B-Instruct-v0.4
TEXT GENERATIONConcurrency Cost:4Model Size:70BQuant:FP8Ctx Length:32kPublished:Mar 3, 2025License:llama3.3Architecture:Transformer0.0K Warm
The Llama-3.3-Swallow-70B-Instruct-v0.4 is a 70 billion parameter instruction-tuned large language model developed by tokyotech-llm, built upon Meta's Llama 3.3. This model specializes in enhancing Japanese language capabilities while maintaining strong English performance, achieved through continual pre-training on a diverse dataset including the Swallow Corpus Version 2. It is optimized for multi-turn dialogue and various Japanese and English tasks, making it suitable for applications requiring robust bilingual understanding and generation.
Loading preview...
Popular Sampler Settings
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.
temperature
–
top_p
–
top_k
–
frequency_penalty
–
presence_penalty
–
repetition_penalty
–
min_p
–