knowledgator/Qwen-encoder-1.5B
Model size: 1.5B · Quantization: BF16 · Context length: 32k · License: apache-2.0 · Architecture: Transformer

The knowledgator/Qwen-encoder-1.5B is a 1.5-billion-parameter Qwen2-based bidirectional text encoder, adapted from a decoder-only LLM using the LLM2Vec recipe. It supports a 131072-token context length and was trained with masked next token prediction on Wikipedia. The conversion turns a modern decoder-only LLM into a powerful text encoder, well suited to discriminative NLP tasks such as text classification, question answering, and token classification.


Overview of Qwen-encoder-1.5B

The knowledgator/Qwen-encoder-1.5B model is a 1.5-billion-parameter Qwen2-based bidirectional text encoder, developed by Knowledgator. It applies the LLM2Vec recipe to transform a decoder-only Large Language Model into a robust text encoder. The recipe involves three steps: enabling bidirectional attention, training with masked next token prediction (here, on Wikipedia), and unsupervised contrastive learning.
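The masked next token prediction (MNTP) step can be pictured with a toy sketch. This is not the actual training code, just the shape of the objective, with random stand-in tensors throughout (PyTorch assumed):

```python
import torch

# Toy illustration of masked next token prediction (MNTP), the LLM2Vec
# training objective: mask some tokens, then predict each masked token
# from the hidden state at the *previous* position. All tensors here are
# random stand-ins, not real model outputs.
torch.manual_seed(0)
vocab_size, hidden_size, seq_len = 100, 16, 8

input_ids = torch.randint(1, vocab_size, (seq_len,))
mask_positions = torch.tensor([3, 6])      # positions chosen for masking
masked_ids = input_ids.clone()
masked_ids[mask_positions] = 0             # 0 plays the role of [MASK]

hidden_states = torch.randn(seq_len, hidden_size)  # stand-in for encoder output
lm_head = torch.nn.Linear(hidden_size, vocab_size)

# MNTP detail: the logits for a masked position i are computed from the
# hidden state at position i - 1, preserving the decoder's next-token habit.
logits = lm_head(hidden_states[mask_positions - 1])
loss = torch.nn.functional.cross_entropy(logits, input_ids[mask_positions])
print(logits.shape)  # torch.Size([2, 100])
```

Shifting the prediction target by one position is what distinguishes MNTP from standard BERT-style masked language modeling.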

Key Capabilities & Features

  • Bidirectional Encoding: A modern decoder-only LLM converted into a powerful bidirectional encoder, combining the strengths of both architectures.
  • Enhanced Context Understanding: Benefits from the large, diverse pre-training corpora and long-context window support of modern decoders.
  • Discriminative Task Adaptation: Easily adaptable for various discriminative NLP tasks, including text classification, question answering, and token classification.
  • Flash Attention Support: Utilizes Flash Attention for improved efficiency.
  • Full Weight Training: Unlike some LLM2Vec implementations, this model trains all weights of the base Qwen2 model, potentially enhancing its bidirectional capabilities.

Use Cases

This model is particularly well-suited for:

  • Text Classification: Categorizing documents or sentences.
  • Question Answering: Extracting answers from text passages.
  • Token Classification (e.g., NER): Identifying and classifying entities within text.
  • Feature Extraction: Generating high-quality embeddings for downstream NLP tasks.
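For feature extraction, a common pattern is mean pooling over the encoder's last hidden state. The sketch below uses a random stand-in for `last_hidden_state`; in practice it would come from running the model (e.g. loaded via `transformers.AutoModel`, possibly with `trust_remote_code=True` depending on how the bidirectional patch is shipped — both details are assumptions, check the model card):

```python
import torch

def mean_pool(last_hidden_state, attention_mask):
    """Average token embeddings, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts

# Dummy batch: 2 sequences, length 6, hidden size 8; the second sequence
# has two padding tokens, which the mask excludes from the average.
torch.manual_seed(0)
last_hidden_state = torch.randn(2, 6, 8)
attention_mask = torch.tensor([[1, 1, 1, 1, 1, 1],
                               [1, 1, 1, 1, 0, 0]])

embeddings = mean_pool(last_hidden_state, attention_mask)
print(embeddings.shape)  # torch.Size([2, 8])
```

Cosine similarity between such pooled vectors then gives a similarity score usable for retrieval, clustering, or as classifier input.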