tzchen07/g2_X9e
The tzchen07/g2_X9e model is a 2.6 billion parameter language model, fine-tuned from unsloth/gemma-2-2b-it. It was trained on a combination of v1.6, v1.6b, v1.6c, v1.6d, and v1.6e datasets. This model is designed for general language generation tasks, leveraging its Gemma-2 base architecture for efficient performance.
Loading preview...
Overview
This model, tzchen07/g2_X9e, is a 2.6 billion parameter language model derived from the unsloth/gemma-2-2b-it base. It has undergone fine-tuning across multiple datasets, specifically v1.6, v1.6b, v1.6c, v1.6d, and v1.6e, to enhance its performance and adaptability.
Training Details
The fine-tuning process utilized specific hyperparameters to optimize its learning:
- Learning Rate: 5e-06
- Batch Size: 4 (train), 8 (eval)
- Gradient Accumulation: 16 steps, resulting in a total effective batch size of 64
- Optimizer: ADAMW_TORCH_FUSED with standard betas and epsilon
- LR Scheduler: Cosine type with a 0.1 warmup ratio
- Epochs: 2.0
Intended Use
While specific intended uses and limitations are not detailed in the provided information, as a fine-tuned Gemma-2 variant, it is generally suitable for a range of natural language processing tasks, including text generation, summarization, and question answering, within its 8192 token context window.