jylee420/gemma-2b-data-std

Text Generation · Model Size: 2.5B · Quant: BF16 · Context Length: 8K · Published: Mar 11, 2024 · License: other · Architecture: Transformer

The jylee420/gemma-2b-data-std model is a 2.5-billion-parameter base version of the Gemma architecture, developed by [email protected]. It has undergone additional pre-training on 6 million tokens of Korean and English text and is designed for causal language modeling with a context length of 8192 tokens.


Overview

This model is an additionally pre-trained version of the Gemma 2B base model, developed by [email protected]. It was further trained on 6 million tokens of Korean and English data. The model targets causal language modeling and supports a context length of 8192 tokens.

Key Characteristics

  • Model Type: Gemma 2B base architecture.
  • Parameter Count: 2.5 billion parameters.
  • Context Length: 8192 tokens (verifiable from the published config, as shown below).
  • Training: Additional pre-training with 6 million tokens.
  • Language Support: Additionally pre-trained on Korean and English data.
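
A quick way to confirm these characteristics is to read the repository's published configuration. The sketch below is a minimal example that assumes the standard Gemma configuration fields exposed by the transformers library; if the repository ships a customized config, the attribute names or values may differ.

```python
# Hedged sketch: inspect the published config to confirm the values listed above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("jylee420/gemma-2b-data-std")

print(config.model_type)               # expected: "gemma"
print(config.max_position_embeddings)  # expected: 8192 (the 8K context length)
print(config.vocab_size)               # vocabulary size used when loading the model
```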

Usage

The model can be loaded with AutoModelForCausalLM.from_pretrained, specifying vocab_size, torch_dtype, and device_map as needed. Specific direct and downstream use cases are not detailed in the provided information, but the base Gemma architecture and additional multilingual pre-training suggest applicability to language generation and understanding tasks, particularly in Korean and English contexts.
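
Below is a minimal loading and generation sketch using the standard transformers AutoModelForCausalLM/AutoTokenizer API. It assumes a typical setup with BF16 weights and an available accelerator; the vocab_size override mentioned above is only needed if the repository's tokenizer and config disagree, so it is omitted here, and the prompt is purely illustrative.

```python
# Minimal sketch, assuming a standard transformers setup; exact kwargs
# (e.g. an explicit vocab_size) may need adjusting to match the repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jylee420/gemma-2b-data-std"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 weights listed above
    device_map="auto",           # place layers on available devices automatically
)

prompt = "대한민국의 수도는"  # Korean prompt: "The capital of South Korea is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since this is a base (not instruction-tuned) model, plain text continuation prompts like the one above are the most appropriate way to exercise it.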