The zstanjj/HTML-Pruner-Llama-1B is a 1-billion-parameter Llama-based model developed by zstanjj for HTML pruning within Retrieval-Augmented Generation (RAG) systems. It reduces the length of HTML documents while retaining their semantic information, using a two-step block-tree-based pruning approach, and is particularly effective for optimizing HTML content for long-context LLMs by keeping only the relevant sections.
Overview
The zstanjj/HTML-Pruner-Llama-1B is a 1-billion-parameter model developed specifically for HTML pruning, a key component of the HtmlRAG system. It processes and condenses HTML documents so they can be used efficiently in Retrieval-Augmented Generation (RAG) pipelines, especially those built on long-context Large Language Models (LLMs).
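To make the "condensing" step concrete, here is a minimal, self-contained sketch of the kind of lossless cleaning applied before pruning: scripts, styles, comments, and tag attributes are stripped while the tag structure and visible text are kept. This is a toy illustration using only the Python standard library, not the actual implementation shipped in the `htmlrag` package, which handles many more cases.

```python
from html.parser import HTMLParser

# Tags whose contents carry no semantic information for RAG.
SKIP_TAGS = {"script", "style", "noscript"}

class Cleaner(HTMLParser):
    """Toy lossless-cleaning pass: drops scripts/styles/comments and
    tag attributes, keeps the tag structure and visible text."""
    def __init__(self):
        super().__init__()
        self.out = []
        self.skip_depth = 0  # >0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.out.append(f"<{tag}>")  # attributes dropped to save tokens

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Keep visible text; collapse whitespace-only runs.
        if self.skip_depth == 0 and data.strip():
            self.out.append(data.strip())

def clean_html(html: str) -> str:
    parser = Cleaner()
    parser.feed(html)
    return "".join(parser.out)

# clean_html('<div><script>x=1</script><p class="a">Hi</p></div>')
# → '<div><p>Hi</p></div>'
```

Comments are dropped automatically because `HTMLParser.handle_comment` is a no-op unless overridden.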
Key Capabilities
- Lossless HTML Cleaning: Removes irrelevant content and compresses redundant structures in HTML while preserving all semantic information. This prepares HTML for RAG systems with long-context LLMs.
- Two-Step Block-Tree-Based HTML Pruning: Employs a two-stage pruning process based on a block tree structure. The first step uses an embedding model (e.g., BAAI/bge-large-en) to rank HTML blocks, and the second step refines this with a generative model (this Llama-1B model).
- Context Optimization: Reduces the token count of HTML documents, allowing more relevant information to fit within the context window of LLMs.
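The first pruning stage above can be sketched as follows: leaf blocks of the block tree are scored against the query and only the top-ranked blocks are kept, in their original document order. For self-containment this sketch substitutes a bag-of-words cosine score for the real embedding model (e.g., BAAI/bge-large-en); step 2 would then re-rank the surviving blocks with the generative 1B model. The function names here are illustrative, not the `htmlrag` package's API.

```python
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    """Bag-of-words cosine similarity (toy stand-in for an embedding model)."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def prune_blocks(query: str, blocks: list[str], keep: int) -> list[str]:
    """Step 1 of the two-step pruning: rank leaf blocks against the query,
    keep the top `keep`, and preserve the original document order."""
    q = Counter(query.lower().split())
    ranked = sorted(range(len(blocks)),
                    key=lambda i: cosine(q, Counter(blocks[i].lower().split())),
                    reverse=True)
    kept = sorted(ranked[:keep])  # restore document order
    return [blocks[i] for i in kept]

blocks = ["Paris is the capital of France.",
          "Cookie policy and ads.",
          "France borders Spain."]
# prune_blocks("capital of France", blocks, 2) drops the boilerplate block.
```

Keeping survivors in document order matters: the pruned HTML must still read as a coherent document when passed to the chat model.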
Performance
When integrated into the HtmlRAG system with Llama-3.1-70B-Instruct as the chat model, HTML-Pruner-Llama-1B demonstrates competitive performance across various question-answering datasets, often outperforming traditional methods like BM25 and BGE on metrics such as Exact Match (EM) and ROUGE-L. For instance, it achieved 60.75 EM on NQ and 45.00 EM on HotpotQA, highlighting its effectiveness in improving RAG system accuracy by providing more focused HTML context.
Good For
- Developers building RAG systems that utilize HTML as a knowledge source.
- Applications requiring efficient processing of lengthy and complex HTML documents.
- Optimizing input context for LLMs to improve retrieval and generation quality.