Retrievers
Last updated
A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store: a retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used as the backbone of a retriever, but there are other types of retrievers as well. Retrievers accept a string query as input and return a list of Documents as output, which lets you search and retrieve information from your indexed documents.
The EnsembleRetriever combines the results of multiple retrievers and reranks them using the Reciprocal Rank Fusion algorithm. Typically, it combines a sparse retriever such as BM25 with a dense retriever such as embedding similarity, capitalizing on their complementary strengths to create a hybrid search system. This combination can improve retrieval accuracy compared to using either algorithm alone.
Params
retrievers – A list of retrievers to ensemble.
weights – A list of weights corresponding to the retrievers. Defaults to equal weighting for all retrievers.
c – A constant added to the rank, controlling the balance between the importance of high-ranked items and the consideration given to lower-ranked items.
To use the Ensemble Retriever component, provide Top K (the number of results to return) and Weights as inputs. Then connect the component to the Documents (Chains, Loaders, Utilities) and Retriever (Vector Stores) components.
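As an illustration of how Reciprocal Rank Fusion uses the weights and c parameters, here is a minimal Python sketch. It is not the component's actual implementation: the function name, the document ids, and the default c=60 are assumptions chosen for the example, and each retriever is represented only by the ranked list of ids it returned.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, weights=None, c=60, top_k=None):
    """Fuse several ranked lists of document ids into one ranking.

    Hypothetical helper: each document scores weight / (rank + c)
    for every list it appears in, and the scores are summed.
    """
    if weights is None:
        # Equal weighting for all retrievers, as described above.
        weights = [1.0 / len(ranked_lists)] * len(ranked_lists)
    if len(weights) != len(ranked_lists):
        raise ValueError("need exactly one weight per ranked list")

    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # A larger c flattens the gap between high and low ranks.
            scores[doc_id] += weight / (rank + c)

    fused = sorted(scores, key=scores.get, reverse=True)
    return fused if top_k is None else fused[:top_k]


# Made-up results from a sparse (BM25-style) and a dense retriever.
sparse_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
print(fused)
```

Documents that rank well in both lists (here doc_b and doc_a) rise to the top, while documents found by only one retriever are still kept further down the fused ranking.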
The MultiQuery Retriever automates prompt tuning by using an LLM to generate several queries, each from a different perspective, for a given user query. It retrieves relevant documents for each generated query and takes the unique union of all results, yielding a broader set of potentially relevant documents. By offering multiple viewpoints on the same question, it mitigates some limitations of distance-based retrieval and can enrich the retrieved results.
Params
question – The query provided by the user.
llm – The LLM used for query generation.
To use the MultiQuery Retriever, connect the component to an LLM, a Prompt, and a Retriever component.
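The "unique union" step of the MultiQuery Retriever can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the component's code: generate_queries stands in for the LLM call, retrieve stands in for the connected retriever, and both stand-ins below return hard-coded values.

```python
def multi_query_retrieve(question, generate_queries, retrieve):
    """Retrieve for each generated query and return the unique union.

    generate_queries and retrieve are hypothetical callables standing
    in for the LLM and the underlying retriever.
    """
    seen = set()
    merged = []
    for query in generate_queries(question):
        for doc in retrieve(query):
            # Keep only the first occurrence of each document.
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


def fake_generate_queries(question):
    # Stand-in for the LLM: fixed rephrasings of the question.
    return ["what is hybrid search", "combining bm25 and embeddings"]


FAKE_INDEX = {
    "what is hybrid search": ["doc_1", "doc_2"],
    "combining bm25 and embeddings": ["doc_2", "doc_3"],
}


def fake_retrieve(query):
    # Stand-in for a retriever: exact-match lookup in a toy index.
    return FAKE_INDEX.get(query, [])


results = multi_query_retrieve(
    "How does hybrid search work?", fake_generate_queries, fake_retrieve
)
print(results)
```

Because doc_2 is returned for both rephrasings, it appears only once in the merged list, while doc_3 is a result that a single query alone would have missed.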
The VectorStoreRetriever is a retriever that utilizes vector similarity to retrieve documents based on an unstructured query. This retriever is typically backed by a vector store, where documents are embedded as vectors in high-dimensional space. Using similarity search, it finds documents most closely aligned with the input query, making it suitable for semantic search applications. This approach is particularly effective for queries where context or meaning matters, as it retrieves documents based on semantic similarity rather than exact keyword matching.
Params
Search Type – The type of similarity measure to use (e.g., similarity, distance).
Search Kwargs (k) – The number of top results to retrieve, where k represents the number of closest matches to the query.
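To make the role of k concrete, the following Python sketch ranks a toy index by cosine similarity and returns the k closest matches. It is only an illustration: the three-dimensional vectors are invented by hand rather than produced by an embedding model, the function names are hypothetical, and a real vector store would normally use an optimized index instead of scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query_vector, indexed_vectors, k=4):
    """Return the ids of the k vectors most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), doc_id)
        for doc_id, vector in indexed_vectors.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hand-made "embeddings" for illustration only.
toy_index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.0, 0.0]

top = top_k_similar(query_vector, toy_index, k=2)
print(top)
```

With k=2, the two vectors pointing in nearly the same direction as the query are returned and the unrelated one is dropped, which is the behavior the k setting controls.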