Faiss Vector Store
The Faiss Vector Store node uses the Faiss (Facebook AI Similarity Search) library for efficient similarity search and clustering of dense vectors. It works with embedded data, retrieving similar documents quickly based on their vector representations.
Node Details
- Name: Faiss
- Type: Faiss
- Version: 1.0
- Category: Vector Stores
Base Classes
- Faiss
- VectorStoreRetriever
- BaseRetriever
Input Parameters
- Document (optional, list)
  - Type: Document
  - Description: List of documents to be stored in the vector store
- Embeddings
  - Type: Embeddings
  - Description: Embedding model used to convert documents into vector representations
- Base Path to load
  - Type: string
  - Description: Path to load or save the faiss.index file
  - Placeholder: C:\Users\User\Desktop
- Top K (optional, additional parameter)
  - Type: number
  - Description: Number of top results to fetch (default: 4)
  - Placeholder: 4
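As a rough illustration of how the inputs above might be normalized before use, the sketch below fills in the documented Top K default of 4 and treats a missing document list as empty. The interface and function names (`FaissNodeInputs`, `resolveTopK`, `resolveInputs`) are hypothetical and are not taken from the node's source; documents are modelled as plain strings for brevity.

```typescript
// Hypothetical shape of the node's inputs; field names are illustrative only.
interface FaissNodeInputs {
  documents?: string[];    // optional list of documents (plain text here)
  basePath: string;        // folder that holds faiss.index
  topK?: number | string;  // optional; form fields often arrive as strings
}

interface ResolvedInputs {
  documents: string[];
  basePath: string;
  topK: number;
}

const DEFAULT_TOP_K = 4; // default stated in the Top K description

// Return a positive integer for Top K, falling back to the default when the
// field is missing, empty, or not a positive number.
function resolveTopK(raw: number | string | undefined): number {
  if (raw === undefined || raw === "") return DEFAULT_TOP_K;
  const parsed = typeof raw === "number" ? raw : Number.parseFloat(raw);
  if (!Number.isFinite(parsed) || parsed < 1) return DEFAULT_TOP_K;
  return Math.floor(parsed);
}

// Apply defaults to every optional field.
function resolveInputs(inputs: FaissNodeInputs): ResolvedInputs {
  return {
    documents: inputs.documents ?? [],
    basePath: inputs.basePath,
    topK: resolveTopK(inputs.topK),
  };
}

const resolved = resolveInputs({ basePath: "C:\\Users\\User\\Desktop" });
console.log(resolved.topK, resolved.documents.length);
```

Falling back to the default on invalid input, rather than throwing, is a design choice made for this sketch; the actual node may validate differently.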
Outputs
- Faiss Retriever
  - Name: retriever
  - Base Classes: [Faiss, VectorStoreRetriever, BaseRetriever]
- Faiss Vector Store
  - Name: vectorStore
  - Base Classes: [Faiss, …FaissStore base classes]
Functionality
The Faiss Vector Store node provides two main functions:
- Upsert:
  - Processes input documents
  - Creates vector embeddings using the provided embedding model
  - Stores the vectors in a Faiss index
  - Saves the index to the specified base path
- Init:
  - Loads an existing Faiss index from the specified base path
  - Creates either a retriever or a vector store based on the output selection
  - Configures the similarity search function to avoid illegal invocation errors
Use Cases
- Efficient similarity search in large document collections
- Building retrieval-augmented generation (RAG) systems
- Creating semantic search engines
- Implementing recommendation systems based on content similarity
Notes
- The node includes a custom implementation of similaritySearchVectorWithScore to handle potential issues with the number of requested results exceeding the total number of stored vectors.
- It’s designed to work seamlessly within a larger system, likely a node-based workflow for natural language processing tasks.
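One way to guard against requesting more results than there are stored vectors is to clamp the requested count before searching. The sketch below shows that idea with a brute-force search; it is an assumed approach for illustration, not the node's actual `similaritySearchVectorWithScore` code, and `searchWithClampedK` is a hypothetical name.

```typescript
// Hypothetical brute-force search that never returns more results than exist.
// Each result is an [index, squaredDistance] pair, nearest first.
function searchWithClampedK(
  vectors: number[][],
  query: number[],
  k: number
): Array<[number, number]> {
  // Clamp k to the number of stored vectors so the search cannot over-request.
  const effectiveK = Math.max(0, Math.min(k, vectors.length));
  const scored = vectors.map((v, i): [number, number] => [
    i,
    v.reduce((sum, vi, j) => sum + (vi - (query[j] ?? 0)) ** 2, 0),
  ]);
  return scored.sort((a, b) => a[1] - b[1]).slice(0, effectiveK);
}

const stored = [[0, 0], [1, 1], [5, 5]];
// Asking for 10 results from a store of 3 vectors yields 3 results.
const results = searchWithClampedK(stored, [1, 0], 10);
console.log(results.length); // 3
```

With clamping in place, a Top K larger than the collection degrades to "return everything, ranked" instead of failing.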