knowledgator/Qwen-encoder-1.5B
Model size: 1.5B · Quantization: BF16 · Context length: 32k · License: apache-2.0 · Architecture: Transformer

The knowledgator/Qwen-encoder-1.5B is a 1.5-billion-parameter Qwen2-based bidirectional text encoder, adapted from a decoder-only LLM using the LLM2Vec recipe. It supports a 131072-token context length and was trained with masked next token prediction on Wikipedia. The conversion turns a modern decoder-only LLM into a powerful text encoder, well suited to discriminative NLP tasks such as text classification, question answering, and token classification.


Overview of Qwen-encoder-1.5B

The knowledgator/Qwen-encoder-1.5B model is a 1.5-billion-parameter Qwen2-based bidirectional text encoder, developed by Knowledgator. It applies the LLM2Vec recipe to transform a decoder-only Large Language Model into a robust text encoder. The recipe involves three steps: enabling bidirectional attention, training with masked next token prediction (here, on Wikipedia), and unsupervised contrastive learning.
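The masked next token prediction (MNTP) step can be pictured with a toy sketch. This is not the actual training code, just the shape of the objective, with random stand-in tensors throughout (PyTorch assumed):

```python
import torch

# Toy illustration of masked next token prediction (MNTP), the LLM2Vec
# training objective: mask some tokens, then predict each masked token
# from the hidden state at the *previous* position. All tensors here are
# random stand-ins, not real model outputs.
torch.manual_seed(0)
vocab_size, hidden_size, seq_len = 100, 16, 8

input_ids = torch.randint(1, vocab_size, (seq_len,))
mask_positions = torch.tensor([3, 6])      # positions chosen for masking
masked_ids = input_ids.clone()
masked_ids[mask_positions] = 0             # 0 plays the role of [MASK]

hidden_states = torch.randn(seq_len, hidden_size)  # stand-in for encoder output
lm_head = torch.nn.Linear(hidden_size, vocab_size)

# MNTP detail: the logits for a masked position i are computed from the
# hidden state at position i - 1, preserving the decoder's next-token habit.
logits = lm_head(hidden_states[mask_positions - 1])
loss = torch.nn.functional.cross_entropy(logits, input_ids[mask_positions])
print(logits.shape)  # torch.Size([2, 100])
```

Shifting the prediction target by one position is what distinguishes MNTP from standard BERT-style masked language modeling.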

Key Capabilities & Features

  • Bidirectional Encoding: A modern decoder-only LLM converted into a powerful bidirectional encoder, combining the strengths of both architectures.
  • Enhanced Context Understanding: Benefits from the large, diverse pre-training corpora and long-context window support of modern decoders.
  • Discriminative Task Adaptation: Easily adaptable for various discriminative NLP tasks, including text classification, question answering, and token classification.
  • Flash Attention Support: Utilizes Flash Attention for improved efficiency.
  • Full Weight Training: Unlike some LLM2Vec implementations, this model trains all weights of the base Qwen2 model, potentially enhancing its bidirectional capabilities.

Use Cases

This model is particularly well-suited for:

  • Text Classification: Categorizing documents or sentences.
  • Question Answering: Extracting answers from text passages.
  • Token Classification (e.g., NER): Identifying and classifying entities within text.
  • Feature Extraction: Generating high-quality embeddings for downstream NLP tasks.
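For feature extraction, a common pattern is mean pooling over the encoder's last hidden state. The sketch below uses a random stand-in for `last_hidden_state`; in practice it would come from running the model (e.g. loaded via `transformers.AutoModel`, possibly with `trust_remote_code=True` depending on how the bidirectional patch is shipped — both details are assumptions, check the model card):

```python
import torch

def mean_pool(last_hidden_state, attention_mask):
    """Average token embeddings, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts

# Dummy batch: 2 sequences, length 6, hidden size 8; the second sequence
# has two padding tokens, which the mask excludes from the average.
torch.manual_seed(0)
last_hidden_state = torch.randn(2, 6, 8)
attention_mask = torch.tensor([[1, 1, 1, 1, 1, 1],
                               [1, 1, 1, 1, 0, 0]])

embeddings = mean_pool(last_hidden_state, attention_mask)
print(embeddings.shape)  # torch.Size([2, 8])
```

Cosine similarity between such pooled vectors then gives a similarity score usable for retrieval, clustering, or as classifier input.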