The rinna/gemma-2-baku-2b is a 2.6 billion parameter transformer-based language model, continually pre-trained by rinna on 80 billion tokens of mixed Japanese and English datasets. Building upon Google's Gemma 2 architecture, this model is specifically optimized to enhance performance on Japanese language tasks. It maintains an 8192-token context length and utilizes the original Gemma 2 tokenizer, making it suitable for applications requiring strong Japanese language understanding and generation.
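The following is a minimal usage sketch with Hugging Face transformers, assuming the model follows the standard `AutoModelForCausalLM` interface; the model ID comes from the description above, while the dtype, device placement, and generation settings are illustrative defaults rather than recommendations from the model card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Model ID from the description; the loading arguments below are typical
# defaults, not values taken from the model card.
model_id = "rinna/gemma-2-baku-2b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; adjust for your hardware
    device_map="auto",
)

# A Japanese prompt, since the model is continually pre-trained for Japanese.
prompt = "西田幾多郎は、"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=50,
        do_sample=True,
        temperature=0.7,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the model is a base (continually pre-trained) language model rather than an instruction-tuned one, plain text-completion prompts like the one above are the natural way to exercise it.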