lightblue/karasu-1.1B

Public · 1.1B parameters · BF16 · 2048 context length · Jan 17, 2024 · License: apache-2.0

Overview

lightblue/karasu-1.1B is a 1.1 billion parameter language model developed by Lightblue Technology. It is built on TinyLlama, specifically the TinyLlama/TinyLlama-1.1B-intermediate-step-715k-1.5T checkpoint, and was further pre-trained for 50,004 steps on a filtered and sampled set of Japanese datasets, including Japanese OSCAR and Japanese mC4, totaling approximately 3 billion tokens.

Key Capabilities

  • Japanese Language Proficiency: Enhanced understanding and generation of Japanese text due to specialized pre-training.
  • Compact Size: At 1.1 billion parameters, it offers a balance between performance and computational efficiency.
  • Causal Language Modeling: Designed for text generation tasks, predicting the next token in a sequence (see the sketch after this list).
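
As a concrete illustration of next-token prediction, here is a minimal sketch using Hugging Face Transformers; the Japanese prompt and the greedy decoding choice are illustrative assumptions, not taken from the model card.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the checkpoint named on this card.
    tok = AutoTokenizer.from_pretrained("lightblue/karasu-1.1B")
    model = AutoModelForCausalLM.from_pretrained("lightblue/karasu-1.1B")

    # "Tokyo is Japan's ..." -- an illustrative Japanese prompt.
    ids = tok("東京は日本の", return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits  # shape: (1, seq_len, vocab_size)

    # Greedy choice of the single most likely next token.
    next_id = logits[0, -1].argmax().item()
    print(tok.decode(next_id))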

Good For

  • Applications requiring a lightweight model with strong Japanese language capabilities.
  • Text generation, summarization, and conversational AI in Japanese.
  • Developers looking for a pre-trained base for further fine-tuning on specific Japanese tasks.

Usage

The model can be used with popular libraries such as Hugging Face Transformers and vLLM, which support efficient inference for a range of use cases; short sketches of both follow.
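
First, a minimal text-generation sketch with Transformers; the prompt, dtype handling, and sampling settings are illustrative assumptions rather than recommendations from the model card.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "lightblue/karasu-1.1B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 weights listed above
        device_map="auto",           # requires the accelerate package
    )

    # "The capital of Japan is ..." -- an illustrative prompt.
    inputs = tokenizer("日本の首都は", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The same model can also be served with vLLM; this sketch assumes a local vLLM install, and the sampling parameters are again illustrative.

    from vllm import LLM, SamplingParams

    llm = LLM(model="lightblue/karasu-1.1B", dtype="bfloat16")
    params = SamplingParams(temperature=0.7, max_tokens=64)
    outputs = llm.generate(["日本の首都は"], params)
    print(outputs[0].outputs[0].text)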