jylee420/gemma-2b-data-std
The jylee420/gemma-2b-data-std model is a 2.5 billion parameter base version of the Gemma architecture, developed by [email protected]. This model has undergone additional pre-training with 6 million tokens, focusing on Korean and English languages. It is designed for causal language modeling tasks, offering a context length of 8192 tokens.
Overview
This model extends the Gemma 2B base model with additional pre-training on 6 million tokens of Korean and English data, developed by [email protected]. It targets causal language modeling and supports a context length of 8192 tokens.
Key Characteristics
- Model Type: Gemma 2B base architecture.
- Parameter Count: 2.5 billion parameters.
- Context Length: 8192 tokens.
- Training: Additional pre-training with 6 million tokens.
- Language Support: Additionally pre-trained on Korean and English data.
Usage
The model can be loaded with `AutoModelForCausalLM.from_pretrained`, optionally passing `vocab_size`, `torch_dtype`, and `device_map` arguments. Specific direct and downstream use cases are not detailed in the provided information, but the base Gemma architecture and the additional bilingual pre-training suggest applicability to language generation and understanding tasks, particularly in Korean and English contexts.
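A minimal loading sketch using the `transformers` library is shown below. The `torch_dtype` and `device_map` values, the prompt, and the generation settings are illustrative assumptions, not settings specified by the model author; access to the underlying Gemma weights may additionally require accepting Google's license on the Hugging Face Hub.

```python
# Hedged sketch: loading jylee420/gemma-2b-data-std for causal LM inference.
# dtype/device choices below are assumptions, not values from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jylee420/gemma-2b-data-std"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption; use torch.float32 on CPU
    device_map="auto",           # requires the `accelerate` package
    # vocab_size can also be overridden here, as the card mentions,
    # if your tokenizer's vocabulary differs from the checkpoint's.
)

# The model handles both Korean and English input.
prompt = "안녕하세요, "  # "Hello, " in Korean
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since this is a base (not instruction-tuned) checkpoint, prompts are continued as free text rather than answered conversationally.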