The knowledgator/Qwen-encoder-1.5B model is a 1.5-billion-parameter Qwen2-based bidirectional text encoder, adapted from a decoder-only LLM using the LLM2Vec recipe. It supports a 131,072-token context length and was trained with masked next token prediction on Wikipedia. The model demonstrates how modern decoder-only LLMs can be converted into powerful text encoders that excel at discriminative NLP tasks such as text classification, question answering, and token classification.
Overview of Qwen-encoder-1.5B
The knowledgator/Qwen-encoder-1.5B model is a 1.5-billion-parameter Qwen2-based bidirectional text encoder developed by Knowledgator. It follows the LLM2Vec recipe to transform a decoder-only large language model into a robust text encoder, a process with three steps: enabling bidirectional attention, training with masked next token prediction (here on Wikipedia), and unsupervised contrastive learning.
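To illustrate the masked next token prediction step, the toy sketch below (pure NumPy, not the actual training code; the `MASK_ID` value and helper name are hypothetical) masks a token and builds shifted labels so that each masked token is predicted from the representation at the *preceding* position, which is what distinguishes MNTP from standard masked language modeling:

```python
import numpy as np

MASK_ID = 0  # hypothetical mask-token id for this toy vocabulary

def mntp_inputs(token_ids, mask_positions):
    """Build (masked inputs, shifted labels) for masked next token prediction.

    In MNTP the logits at position i-1 are used to predict the token that
    was masked at position i, so labels are shifted left by one step.
    """
    ids = np.array(token_ids)
    inputs = ids.copy()
    inputs[mask_positions] = MASK_ID      # hide the selected tokens
    labels = np.full_like(ids, -100)      # -100 = ignored by the loss
    for p in mask_positions:
        if p > 0:                         # position 0 has no predecessor
            labels[p - 1] = ids[p]        # predict masked token from i-1
    return inputs, labels

inputs, labels = mntp_inputs([5, 7, 9, 11], mask_positions=[2])
print(inputs.tolist())  # [5, 7, 0, 11]
print(labels.tolist())  # [-100, 9, -100, -100]
```

The left shift of the labels is the only change relative to ordinary masked language modeling, which keeps the objective close to what the decoder-only base model was originally trained on.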
Key Capabilities & Features
- Bidirectional Encoding: The decoder-only base model is converted into a powerful bidirectional encoder, combining the strengths of both architectures.
- Enhanced Context Understanding: Benefits from the large, diverse pre-training corpora and long-context window support of modern decoders.
- Discriminative Task Adaptation: Easily adaptable for various discriminative NLP tasks, including text classification, question answering, and token classification.
- Flash Attention Support: Utilizes Flash Attention for improved efficiency.
- Full Weight Training: Unlike some LLM2Vec implementations, this model trains all weights of the base Qwen2 model, potentially enhancing its bidirectional capabilities.
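To make the bidirectional-encoding point concrete, here is a minimal NumPy sketch (illustrative only, not how the model builds its masks internally) of the difference between the causal attention mask a decoder uses and the all-ones mask a bidirectional encoder uses:

```python
import numpy as np

def causal_mask(seq_len):
    """Lower-triangular mask: token i may attend only to positions <= i."""
    return np.tril(np.ones((seq_len, seq_len), dtype=int))

def bidirectional_mask(seq_len):
    """All-ones mask: every token may attend to every other token."""
    return np.ones((seq_len, seq_len), dtype=int)

print(causal_mask(3))         # zeros above the diagonal
print(bidirectional_mask(3))  # no zeros anywhere
```

With the causal mask, the representation of an early token can never incorporate information from later tokens; replacing it with the all-ones mask is what lets every token condition on its full left and right context.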
Use Cases
This model is particularly well-suited for:
- Text Classification: Categorizing documents or sentences.
- Question Answering: Extracting answers from text passages.
- Token Classification (e.g., NER): Identifying and classifying entities within text.
- Feature Extraction: Generating high-quality embeddings for downstream NLP tasks.
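For feature extraction, a sequence embedding is commonly obtained by mean-pooling the encoder's token embeddings under the attention mask. The sketch below is a minimal illustration of that pooling step only: the `hidden_states` array stands in for the model's real output, which in practice you would obtain by loading the checkpoint with the `transformers` library.

```python
import numpy as np

def mean_pool(hidden_states, attention_mask):
    """Average token embeddings, ignoring padding positions.

    hidden_states:  (seq_len, dim) token embeddings from the encoder
    attention_mask: (seq_len,) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, None].astype(float)   # (seq_len, 1)
    summed = (hidden_states * mask).sum(axis=0)    # sum over real tokens only
    counts = mask.sum()                            # number of real tokens
    return summed / np.clip(counts, 1e-9, None)    # avoid division by zero

# Toy example: 3 tokens of dim 2, with the last position being padding.
h = np.array([[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]])
m = np.array([1, 1, 0])
print(mean_pool(h, m))  # [2. 3.] -- the padding row is excluded
```

Masked mean pooling is a common default for LLM2Vec-style encoders; alternatives such as last-token or weighted pooling trade off similarly and can be swapped in without changing the rest of the pipeline.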