Vector search settings can be configured both server-side and at runtime. Runtime settings are passed as a dictionary to the search and RAG endpoints. You may refer to the search API documentation here for additional materials.

Example using the Python SDK:

vector_search_settings = {
    "use_vector_search": True,
    "filters": {"document_type": "article"},
    "search_limit": 20,
    "use_hybrid_search": True,
    "selected_collection_ids": ["c3291abf-8a4e-5d9d-80fd-232ef6fd8526"]
}

response = client.search("query", vector_search_settings=vector_search_settings)

Configurable Parameters

VectorSearchSettings

  1. use_vector_search (bool): Whether to use vector search
  2. use_hybrid_search (bool): Whether to perform a hybrid search (combining vector and keyword search)
  3. filters (dict): Filters to apply to the vector search
  4. search_limit (int): Maximum number of results to return (1-1000)
  5. selected_collection_ids (list[UUID]): Group IDs to search for
  6. index_measure (IndexMeasure): The distance measure to use for indexing (cosine_distance, l2_distance, or max_inner_product)
  7. include_values (bool): Whether to include search score values in the search results
  8. include_metadatas (bool): Whether to include element metadata in the search results
  9. probes (Optional[int]): Number of ivfflat index lists to query (default: 10)
  10. ef_search (Optional[int]): Size of the dynamic candidate list for HNSW index search (default: 40)
  11. hybrid_search_settings (Optional[HybridSearchSettings]): Settings for hybrid search

HybridSearchSettings

  1. full_text_weight (float): Weight to apply to full text search (default: 1.0)
  2. semantic_weight (float): Weight to apply to semantic search (default: 5.0)
  3. full_text_limit (int): Maximum number of results to return from full text search (default: 200)
  4. rrf_k (int): K-value for RRF (Rank Reciprocal Fusion) (default: 50)

Advanced Filtering

R2R supports complex filtering using PostgreSQL-based queries. Allowed operators include:

  • eq, neq: Equality and inequality
  • gt, gte, lt, lte: Greater than, greater than or equal, less than, less than or equal
  • like, ilike: Pattern matching (case-sensitive and case-insensitive)
  • in, nin: Inclusion and exclusion in a list of values

Example of advanced filtering:

filters = {
    "$and": [
        {"publication_date": {"$gte": "2023-01-01"}},
        {"author": {"$in": ["John Doe", "Jane Smith"]}},
        {"category": {"$ilike": "%technology%"}}
    ]
}
vector_search_settings["filters"] = filters