MatanBT/backdoor-model-1 is a 2.6 billion parameter causal language model fine-tuned from Google's instruction-tuned gemma-2-2b-it checkpoint. The specific fine-tuning dataset and the model's primary differentiator are not detailed in its current documentation; it is intended for general language generation tasks, building upon the foundational capabilities of the gemma-2-2b-it base model.
Overview
While the dataset used for fine-tuning is not documented, the model builds upon the instruction-tuned capabilities of its Gemma 2 predecessor. It was trained with a learning rate of 2e-05 over 3 epochs, using adamw_torch_fused as its optimizer.
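The reported hyperparameters map onto Hugging Face `TrainingArguments` roughly as follows. This is a sketch reconstructed from the card, not the actual training script; the output directory and any setting not listed above are placeholders:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the documented hyperparameters;
# the original training configuration is not published.
args = TrainingArguments(
    output_dir="backdoor-model-1",   # placeholder
    learning_rate=2e-5,              # documented learning rate
    num_train_epochs=3,              # documented epoch count
    per_device_train_batch_size=4,   # documented batch size
    optim="adamw_torch_fused",       # documented optimizer
    lr_scheduler_type="linear",      # documented scheduler
    warmup_ratio=0.1,                # documented warmup ratio
)
```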
Key Characteristics
- Base Model: Fine-tuned from google/gemma-2-2b-it.
- Parameter Count: 2.6 billion parameters.
- Context Length: Supports a context window of 8192 tokens.
- Training: Trained for 3 epochs with a batch size of 4, using a linear learning rate scheduler with a 0.1 warmup ratio.
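Under these settings, a linear scheduler with a 0.1 warmup ratio ramps the learning rate from zero up to its peak over the first 10% of optimizer steps, then decays it linearly back to zero. A minimal sketch of that schedule (the total step count here is illustrative, not taken from the card):

```python
def linear_warmup_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at `step` for a linear schedule with warmup."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr down to 0 after warmup.
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 1000  # hypothetical total optimizer steps (3 epochs x steps/epoch)
print(linear_warmup_lr(0, total))     # start of warmup: 0.0
print(linear_warmup_lr(100, total))   # end of warmup: peak 2e-05
print(linear_warmup_lr(1000, total))  # end of training: 0.0
```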
Intended Use Cases
Given its foundation, this model is suitable for various natural language processing tasks, including:
- Text Generation: Creating coherent and contextually relevant text.
- Instruction Following: Responding to prompts and instructions based on its instruction-tuned base.
- General Language Understanding: Tasks requiring comprehension of text inputs.
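Because the base model is instruction-tuned, prompts should follow Gemma's chat turn format. A minimal sketch of that formatting as plain string construction (in practice the tokenizer's `apply_chat_template` method handles this):

```python
def format_gemma_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma's chat turn markers."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# The model is expected to continue generating after the final
# "<start_of_turn>model" marker.
prompt = format_gemma_prompt("Summarize the Gemma 2 architecture.")
print(prompt)
```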
Further details on specific intended uses and limitations are not provided in the current documentation, suggesting it may serve as a general-purpose language model or a base for further specialization.