artificialguybr/QWEN-2-1.5B-Synthia-I

TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:Nov 13, 2024License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

The artificialguybr/QWEN-2-1.5B-Synthia-I is a 1.5 billion parameter causal language model, fine-tuned by artificialguybr from the Qwen2-1.5B base model. It specializes in instruction following and task completion, leveraging training on the 20.7k instruction-following examples from the Synthia v1.5-I dataset. This model enhances language understanding, generation, and structured data processing, making it suitable for conversational AI applications.

Loading preview...

Model Overview

This model, artificialguybr/QWEN-2-1.5B-Synthia-I, is a fine-tuned version of the Qwen2-1.5B base model, developed by artificialguybr. It has 1.5 billion parameters and is built upon the Qwen2 series architecture, known for its advancements in language understanding, generation, and multi-language support.

Key Capabilities

  • Enhanced Instruction Following: Specifically fine-tuned on the Synthia v1.5-I dataset, comprising over 20.7k instruction-following examples.
  • Text Generation and Completion: Excels at generating coherent and contextually relevant text.
  • Conversational AI: Designed for applications requiring interactive dialogue and task completion.
  • Structured Data Processing: Inherits the base model's improvements in handling structured information.
  • Long Context Handling: Benefits from the Qwen2 base model's capabilities in processing longer sequences, with a sequence length of 4096 used during fine-tuning.

Training Details

The model was fine-tuned using the following key hyperparameters:

  • Dataset: Synthia v1.5-I (20.7k instruction-following examples)
  • Learning Rate: 1e-05
  • Epochs: 3
  • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • Framework: Transformers 4.45.0.dev0, Pytorch 2.3.1+cu121

Intended Uses

This model is well-suited for developers looking for a compact yet capable model for instruction-based tasks, text generation, and conversational AI, particularly where strong instruction adherence is critical.