Embeddings Filter Retriever
The Embeddings Filter Retriever is a specialized retriever that uses embeddings to filter out documents unrelated to a given query. It’s designed to improve the relevance of retrieved documents by comparing their embeddings to the query embedding.
Node Details
- Name: embeddingsFilterRetriever
- Type: EmbeddingsFilterRetriever
- Version: 1.0
- Category: Retrievers
- Base Classes: EmbeddingsFilterRetriever, BaseRetriever
Description
This node implements a document compressor that uses embeddings to drop documents unrelated to the query. It combines a base retriever (typically a vector store retriever) with an embeddings filter to refine the retrieval process.
Input Parameters
- Vector Store Retriever (baseRetriever)
  - Type: VectorStoreRetriever
  - Description: The base retriever to use for initial document retrieval.
- Embeddings (embeddings)
  - Type: Embeddings
  - Description: The embeddings model to use for encoding queries and documents.
- Query (query)
  - Type: string
  - Optional: Yes
  - Description: Specific query to retrieve documents. If not provided, the user’s question will be used.
- Similarity Threshold (similarityThreshold)
  - Type: number
  - Default: 0.8
  - Optional: Yes
  - Description: Minimum similarity between a document's embedding and the query embedding for the document to be kept. Documents scoring below the threshold are dropped.
- K (k)
  - Type: number
  - Default: 20
  - Optional: Yes
  - Description: The maximum number of relevant documents to return. Can be left undefined, in which case similarityThreshold must be specified.
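To illustrate how Similarity Threshold and K can interact, here is a minimal standalone sketch. It is not the LangChain EmbeddingsFilter code: the function names `cosineSimilarity` and `filterByEmbedding` are hypothetical, the use of cosine similarity and a strict `>` comparison are assumptions, and the vectors are assumed to be precomputed by whatever embeddings model is connected.

```typescript
// Hypothetical sketch of embedding-based filtering; not the library implementation.

// Cosine similarity between two vectors (assumed similarity measure).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // treat a missing component as zero
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the indices of the documents to keep, most similar first.
function filterByEmbedding(
  queryVec: number[],
  docVecs: number[][],
  opts: { k?: number; similarityThreshold?: number }
): number[] {
  const { k, similarityThreshold } = opts;
  if (k === undefined && similarityThreshold === undefined) {
    throw new Error("Specify k, similarityThreshold, or both");
  }
  // Score every document against the query and rank by descending similarity.
  let scored = docVecs.map((vec, index) => ({
    index,
    score: cosineSimilarity(queryVec, vec),
  }));
  scored.sort((x, y) => y.score - x.score);
  // Keep at most k documents when k is set.
  if (k !== undefined) scored = scored.slice(0, k);
  // Drop documents that do not exceed the threshold when one is set.
  if (similarityThreshold !== undefined) {
    const threshold = similarityThreshold;
    scored = scored.filter((s) => s.score > threshold);
  }
  return scored.map((s) => s.index);
}
```

In this sketch, setting both options means a document must rank within the top k and also exceed the threshold, so a high threshold can return fewer than k documents.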
Outputs
- Embeddings Filter Retriever (retriever)
  - Type: EmbeddingsFilterRetriever, BaseRetriever
  - Description: The configured retriever object.
- Document (document)
  - Type: Document, json
  - Description: Array of document objects containing metadata and pageContent.
- Text (text)
  - Type: string, json
  - Description: Concatenated string from pageContent of retrieved documents.
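The three outputs differ only in shape. The following sketch shows one way the selection could look; the `Doc` interface, the `selectOutput` helper, and the newline separator used for the text output are assumptions for illustration, not the node's confirmed behavior.

```typescript
// Hypothetical sketch of the three output shapes; names are illustrative.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

type OutputName = "retriever" | "document" | "text";

// Picks what to hand to the next node, given the chosen output.
function selectOutput<R>(
  output: OutputName,
  retriever: R,
  docs: Doc[]
): R | Doc[] | string {
  if (output === "retriever") {
    return retriever; // the configured retriever object itself
  }
  if (output === "document") {
    return docs; // array of { pageContent, metadata } objects
  }
  // "text": join the pageContent of every document (separator is an assumption)
  return docs.map((d) => d.pageContent).join("\n");
}
```

The retriever output suits chains that run retrieval themselves, while the document and text outputs carry already-retrieved content.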
Functionality
The Embeddings Filter Retriever works by:
- Using the base retriever to fetch an initial set of documents.
- Applying an embeddings filter to refine the results based on similarity to the query.
- Returning either the retriever object, the filtered documents, or the concatenated text of the documents based on the specified output.
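The first two steps above, together with the fallback from the optional Query parameter to the user's question, can be sketched as a simple two-stage function. Everything here is a hypothetical stand-in: `BaseRetrieve` and `DocFilter` represent the connected retriever and the embeddings filter, the keyword match is only a placeholder for embedding similarity, and the code is synchronous for brevity where real retrievers are asynchronous.

```typescript
// Hypothetical two-stage sketch: retrieve first, then filter.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Stand-ins for the base retriever and the embeddings filter.
type BaseRetrieve = (query: string) => Doc[];
type DocFilter = (query: string, docs: Doc[]) => Doc[];

function retrieveAndFilter(
  baseRetrieve: BaseRetrieve,
  filterDocs: DocFilter,
  userQuestion: string,
  query?: string
): Doc[] {
  // Use the explicit query when one is set, otherwise the user's question.
  const effectiveQuery = query ? query : userQuestion;
  // Step 1: fetch an initial candidate set from the base retriever.
  const candidates = baseRetrieve(effectiveQuery);
  // Step 2: keep only the candidates the filter judges relevant.
  return filterDocs(effectiveQuery, candidates);
}

// Toy components for demonstration only.
const corpus: Doc[] = [
  { pageContent: "Cats sleep a lot.", metadata: { id: 1 } },
  { pageContent: "Dogs enjoy long walks.", metadata: { id: 2 } },
];
const baseRetrieve: BaseRetrieve = () => corpus;
// Placeholder relevance test; a real filter would compare embeddings.
const keywordFilter: DocFilter = (q, docs) =>
  docs.filter((d) => d.pageContent.toLowerCase().includes(q.toLowerCase()));

const filtered = retrieveAndFilter(baseRetrieve, keywordFilter, "cats");
console.log(filtered.map((d) => d.pageContent));
```

Keeping the base retriever and the filter as separate pieces mirrors the node's inputs: either one can be swapped without touching the other.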
Use Cases
- Improving relevance in document retrieval tasks.
- Reducing noise in retrieved documents for more focused language model inputs.
- Enhancing question-answering systems by providing more relevant context.
Notes
- Either ‘k’ or ‘similarityThreshold’ must be specified for proper functioning.
- The node uses the ContextualCompressionRetriever and EmbeddingsFilter from the LangChain library.
- It handles escape characters in the output text when returning concatenated document content.