Overview
Yuan-embedding-2.0-en is a 0.8-billion-parameter embedding model from IEITYuan, engineered for English text retrieval and reranking. It builds on Qwen/Qwen3-Embedding-0.6B and incorporates several key optimizations that improve its performance in semantic search applications.
Key Capabilities
- Optimized for English Text Retrieval: Specifically designed to generate high-quality embeddings for English text, facilitating accurate semantic search.
- Enhanced for Reranking Tasks: Beyond initial retrieval, the model is also fine-tuned to improve the ranking of search results.
- Advanced Data Augmentation:
  - Hard Negative Sampling: Employs a dual evaluation process using a Rerank model and an LLM to filter high-quality positive and negative samples, improving model robustness.
  - LLM-Synthesized Data: Leverages the Yuan2-M32 model to rewrite query data within the training dataset, expanding and diversifying the training examples.
- Sophisticated Loss Function Design: Incorporates a multi-task loss function and Matryoshka Representation Learning. It uses an InfoNCE loss with in-batch negatives for both the retrieval and reranking tasks, which is crucial for learning effective representations.
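The two training-time ideas named above, InfoNCE with in-batch negatives and Matryoshka prefix truncation, can be sketched in plain Python. This is a minimal illustration, not the model's actual training code: the function names, toy vectors, and the temperature value of 0.05 are our own assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def info_nce_in_batch(queries, docs, temperature=0.05):
    """InfoNCE with in-batch negatives: for query i, docs[i] is the positive
    and every other document in the batch serves as a negative."""
    loss = 0.0
    for i, q in enumerate(queries):
        logits = [dot(q, d) / temperature for d in docs]
        log_denom = math.log(sum(math.exp(l) for l in logits))
        loss += -(logits[i] - log_denom)  # -log softmax of the positive
    return loss / len(queries)

def matryoshka_truncate(v, dim):
    """Matryoshka Representation Learning trains prefixes of the embedding to
    be usable on their own: keep the first `dim` coordinates, re-normalize."""
    return normalize(v[:dim])

# Toy batch of 4-d unit embeddings (illustrative values, not model outputs).
queries = [normalize([1.0, 0.1, 0.0, 0.0]), normalize([0.0, 1.0, 0.1, 0.0])]
docs    = [normalize([1.0, 0.0, 0.1, 0.0]), normalize([0.1, 1.0, 0.0, 0.0])]

loss = info_nce_in_batch(queries, docs)
short = matryoshka_truncate(queries[0], 2)
```

Pairing each query with its positive while treating the rest of the batch as negatives is what makes in-batch sampling cheap: every forward pass yields `batch_size - 1` free negatives per query.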
Good For
- Generating embeddings for English text.
- Improving the accuracy of semantic search systems.
- Enhancing the relevance and order of retrieved documents through reranking.
- Applications requiring robust text similarity and contextual understanding in English.
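The retrieve-then-rerank flow these use cases describe can be sketched end to end. A hedged toy example: the embeddings below are hard-coded stand-ins (in a real system they would come from Yuan-embedding-2.0-en), and `rerank_score` is a placeholder for a genuine reranker that scores (query, document) text pairs.

```python
import math

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return num / (na * nb)

# Stand-in document embeddings; in practice these come from the model.
corpus = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.2],
    "doc_c": [0.2, 0.2, 0.9],
}
query = [0.85, 0.15, 0.05]  # stand-in query embedding

# Stage 1: retrieval -- rank the whole corpus by embedding similarity.
ranked = sorted(corpus, key=lambda d: cosine(query, corpus[d]), reverse=True)
top_k = ranked[:2]

# Stage 2: reranking -- re-score only the top-k candidates.
def rerank_score(doc_id):
    # Placeholder: a real reranker scores the (query, document) text pair.
    return cosine(query, corpus[doc_id])

reranked = sorted(top_k, key=rerank_score, reverse=True)
```

The two-stage design is the standard trade-off: cheap embedding similarity narrows millions of documents to a handful, and the more expensive reranker then refines the order of just those candidates.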