A retriever that uses two sets of embeddings to perform adaptive retrieval. Based on the blog post "Matryoshka embeddings: faster OpenAI vector search using Adaptive Retrieval" (https://supabase.com/blog/matryoshka-embeddings).

This class performs "Adaptive Retrieval" for searching text embeddings efficiently using the Matryoshka Representation Learning (MRL) technique. It retrieves documents similar to a query embedding in two steps:

First-pass: Uses a lower dimensional sub-vector from the MRL embedding for an initial, fast, but less accurate search.

Second-pass: Re-ranks the top results from the first pass using the full, high-dimensional embedding for higher accuracy.

In short, the class makes vector search over MRL embeddings efficient by combining a fast, lower-dimensional initial search with accurate, high-dimensional re-ranking.
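The two passes described above can be sketched in plain TypeScript. This is a minimal illustration and not the class's actual implementation: the `Doc` type and the `dot`, `cosine`, `topK`, and `adaptiveSearch` helpers are hypothetical names, the first pass simply truncates each full vector to its leading `smallDims` components, and cosine similarity is assumed as the metric.

```typescript
// Hypothetical in-memory document: an id plus its full MRL embedding.
type Doc = { id: string; embedding: number[] };

// Dot product; a missing component in b is treated as 0.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, v, i) => sum + v * (b[i] ?? 0), 0);
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  const denom = Math.sqrt(dot(a, a) * dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}

// Rank docs by cosine similarity using only the first `dims` components,
// and keep the best `k`.
function topK(query: number[], docs: Doc[], dims: number, k: number): Doc[] {
  const q = query.slice(0, dims);
  return docs
    .map((doc) => ({ doc, score: cosine(q, doc.embedding.slice(0, dims)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.doc);
}

// First pass: cheap search on the truncated sub-vector (smallK candidates).
// Second pass: re-rank those candidates with the full vector (largeK results).
function adaptiveSearch(
  query: number[],
  docs: Doc[],
  smallDims: number,
  smallK: number,
  largeK: number
): Doc[] {
  const candidates = topK(query, docs, smallDims, smallK);
  return topK(query, candidates, query.length, largeK);
}
```

A larger `smallK` makes it less likely that a relevant document is dropped in the first pass, at the cost of more full-dimensional comparisons in the second.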

Type Parameters

Hierarchy

  • VectorStoreRetriever<Store>
    • MatryoshkaRetriever

Constructors

Properties

largeEmbeddingKey: string = "lc_large_embedding"
largeEmbeddingModel: Embeddings
largeK: number = 8
searchType: "cosine" | "innerProduct" | "euclidean" = "cosine"
smallK: number = 50

Methods

  • Overrides the default addDocuments method to embed each document twice: once with the larger embeddings model, and once with the default embeddings model linked to the vector store.

    Parameters

    • documents: DocumentInterface[]

      An array of documents to add to the vector store.

    • Optional options: AddDocumentOptions

      An optional object containing additional options for adding documents.

    Returns Promise<void | string[]>

    A promise that resolves to an array of the IDs of the documents added to the vector store, or to void if the vector store does not return IDs.
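
One way to picture the double embedding is that the large embedding travels with each document while the vector store indexes the smaller one. The sketch below assumes, without confirmation from this page, that the large embedding is serialized as a JSON string into the document's metadata under the key given by `largeEmbeddingKey`. `SimpleDoc` and `attachLargeEmbeddings` are hypothetical names, and the large embeddings are passed in precomputed rather than produced by a real embeddings model.

```typescript
// Hypothetical minimal document shape for illustration.
type SimpleDoc = { pageContent: string; metadata: Record<string, unknown> };

// Return copies of the documents with each large embedding stored in metadata.
// Assumption: the embedding is kept as a JSON string under `key`.
function attachLargeEmbeddings(
  docs: SimpleDoc[],
  largeEmbeddings: number[][],
  key: string = "lc_large_embedding"
): SimpleDoc[] {
  if (docs.length !== largeEmbeddings.length) {
    throw new Error("Expected one large embedding per document");
  }
  return docs.map((doc, i) => ({
    pageContent: doc.pageContent,
    // Existing metadata is preserved; the input documents are not mutated.
    metadata: { ...doc.metadata, [key]: JSON.stringify(largeEmbeddings[i]) },
  }));
}
```

The resulting documents could then be handed to the vector store, which would embed them again with its own default model for the first-pass search.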
