Cloud Blog: From millions to billions: Announcing vector search in Memorystore for Valkey and Redis Cluster

Source URL: https://cloud.google.com/blog/products/databases/vector-search-for-memorystore-for-valkey-and-redis-cluster/
Source: Cloud Blog
Title: From millions to billions: Announcing vector search in Memorystore for Valkey and Redis Cluster

Feedly Summary: With the addition of vector search earlier this year, Memorystore for Redis emerged as an ideal platform for gen AI use cases such as Retrieval Augmented Generation (RAG), recommendation systems, semantic search, and more. Why? Because of its ultra-low latency vector search. Just a single Memorystore for Redis instance can perform vector search at single-digit millisecond latency over tens of millions of vectors. But what if you want to store more vectors than can fit into a single VM? 
Today, we’re excited to announce vector search on both the new Memorystore for Valkey and Memorystore for Redis Cluster, combining 1) ultra-low latency in-memory vector search, 2) zero-downtime scalability (in or out), and 3) high-performance vector search across millions or even billions of vectors. Currently in preview, vector support for these Memorystore offerings means you can now scale out your cluster to 250 shards, storing billions of vectors in a single instance. In fact, a single Memorystore for Redis Cluster instance can perform vector search at single-digit millisecond latency over more than a billion vectors with greater than 99% recall! This scale enables demanding enterprise applications such as semantic search over a global corpus of data. 
Scalable in-memory vector search
The key to this performance and scalability is partitioning the vector index across the nodes in the cluster. Memorystore uses a local index partitioning strategy, meaning that each node contains a partition of the index that corresponds to the portion of the keyspace that is stored locally. Since the keyspace is already uniformly sharded using the OSS cluster protocol, each index partition is of roughly equal size.
Because of this design, adding nodes linearly improves index build times for all vector indices. Additionally, if the number of vectors is held constant, adding nodes improves Hierarchical Navigable Small World (HNSW) search performance logarithmically, and brute-force search performance improves linearly. Putting it all together, a single cluster can allow for a billion vectors to be indexable and searchable while maintaining fast index build times and low search latencies at high recall.
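To make that reasoning concrete, here is a minimal, illustrative Python sketch of the back-of-the-envelope math, assuming a hypothetical corpus of one billion vectors spread across the 250-shard maximum mentioned above (the numbers are assumptions for illustration, not sizing guidance):

```python
import math

# Illustrative assumptions only: a billion-vector corpus and the 250-shard maximum.
total_vectors = 1_000_000_000
shards = 250

# With local index partitioning, each shard indexes only its slice of the keyspace.
vectors_per_shard = total_vectors // shards
print(f"Vectors per shard: {vectors_per_shard:,}")  # ~4,000,000

# HNSW query work grows roughly logarithmically with partition size,
# while brute-force (exhaustive) search grows linearly with partition size.
print(f"Relative HNSW work per node:        {math.log2(vectors_per_shard) / math.log2(total_vectors):.2f}x")
print(f"Relative brute-force work per node: {vectors_per_shard / total_vectors:.4f}x")
```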
Hybrid queries
In addition to improved scalability, we are also excited to launch support for hybrid queries on Memorystore for Valkey and Memorystore for Redis Cluster. Hybrid queries let you combine vector searches with filters on numeric and tag fields. By combining numeric, tag, and vector search, you can use Memorystore to answer complex queries. 
Suppose you are an online clothing retailer and you want to provide recommendations for similar items. Using a vector index, you can find semantically similar items using embeddings and vector similarity search. But with vector search alone, you may surface irrelevant results that should be filtered out. A user could be searching for a red dress, yet the results might include a different article of clothing (e.g. red hats) or items that are much more expensive than the original item. 
To solve this problem with hybrid search, you can:
1. Use `FT.CREATE` to create a new vector index with additional fields for filtering:
`FT.CREATE inventory_index SCHEMA embedding VECTOR HNSW 6 DIM 128 TYPE FLOAT32 DISTANCE_METRIC L2 clothing_type TAG clothing_price_usd NUMERIC`
This creates an index `inventory_index` with:

- A vector field `embedding` for the semantic embedding of the clothing item
- A tag field `clothing_type` for the type of the article of clothing (e.g. “dress” or “hat”)
- A numeric field `clothing_price_usd` for the price of the article of clothing

2. Use `FT.SEARCH` to perform a hybrid query on `inventory_index`. For example, we can query for 10 results while filtering to only articles of clothing of type “dress” and within the price range of $100 to $200:
`FT.SEARCH inventory_index "(@clothing_type:{dress} @clothing_price_usd:[100-200])=>[KNN 10 @embedding $query_vector]" PARAMS 2 query_vector "…" DIALECT 2`
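If you are driving these commands from application code rather than a CLI, the end-to-end flow looks roughly like the redis-py sketch below. It is a minimal sketch under stated assumptions, not an official recipe: the endpoint is a placeholder, the key `item:1001` and the random vectors are stand-ins for real data, and it assumes the usual convention of passing FLOAT32 vectors as raw little-endian bytes; check the Memorystore documentation for client and connection specifics.

```python
import numpy as np
from redis.cluster import RedisCluster

# Placeholder endpoint; replace with your Memorystore instance's discovery endpoint.
client = RedisCluster(host="10.0.0.3", port=6379)

# 1. Create the hybrid index (same command as shown above).
client.execute_command(
    "FT.CREATE", "inventory_index", "SCHEMA",
    "embedding", "VECTOR", "HNSW", "6",
    "DIM", "128", "TYPE", "FLOAT32", "DISTANCE_METRIC", "L2",
    "clothing_type", "TAG",
    "clothing_price_usd", "NUMERIC",
)

# 2. Store an item as a hash; the embedding is stored as raw FLOAT32 bytes.
item_embedding = np.random.rand(128).astype(np.float32)  # stand-in for a real embedding
client.hset("item:1001", mapping={
    "embedding": item_embedding.tobytes(),
    "clothing_type": "dress",
    "clothing_price_usd": 120,
})

# 3. Hybrid query: KNN over the vector field, filtered by tag and price.
query_vector = np.random.rand(128).astype(np.float32)  # stand-in for the query embedding
results = client.execute_command(
    "FT.SEARCH", "inventory_index",
    "(@clothing_type:{dress} @clothing_price_usd:[100-200])=>[KNN 10 @embedding $query_vector]",
    "PARAMS", "2", "query_vector", query_vector.tobytes(),
    "DIALECT", "2",
)
print(results)
```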
These filter expressions also support boolean logic, meaning that multiple fields can be combined to fine-tune the search results to only those that matter. With this new functionality, applications can tune vector search queries to their needs to get even richer results than before.
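For example, assuming the filter grammar follows the familiar RediSearch-style conventions (an assumption worth confirming against the Memorystore documentation), a query that accepts either dresses or skirts in the same price range might look like:
`FT.SEARCH inventory_index "(@clothing_type:{dress|skirt} @clothing_price_usd:[100-200])=>[KNN 10 @embedding $query_vector]" PARAMS 2 query_vector "…" DIALECT 2`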
Standing behind OSS Valkey
In the open-source community, there’s a lot of enthusiasm for the Valkey key-value datastore. As part of our commitment to make Valkey amazing, we’ve coauthored an RFC (Request For Comments submission) and we’re working with the open source community to donate our vector search capabilities to Valkey. An RFC is the first step in driving alignment within the community, and we welcome feedback on our proposal and implementation. Our primary goal is to enable Valkey developers around the world to leverage Valkey vector search to create amazing gen AI applications. 
The search is over for fast and scalable vector search
With today’s addition of fast and scalable vector search on Memorystore for Valkey and Memorystore for Redis Cluster, in addition to the existing functionality on Memorystore for Redis, Memorystore now offers ultra-low latency vector search across all of its most popular engines. So when you’re building generative AI applications that require robust and consistent low-latency vector search, Memorystore will be hard to beat. Get started today by creating a Memorystore for Valkey or Memorystore for Redis Cluster instance to experience the speed of in-memory search.

AI Summary and Description: Yes

Summary: The text elaborates on the enhanced capabilities of Memorystore for Redis with the introduction of vector search, offering ultra-low latency and scalability for generative AI applications like Retrieval Augmented Generation, semantic search, and recommendation systems. It highlights new features like hybrid queries that combine vector searches with numeric and tag filters, making it easier for applications to deliver precise results.

Detailed Description:

The announcement discusses significant improvements in Memorystore for Redis and Valkey, particularly focusing on its application in generative AI and related use cases. Here are the major points:

– **Introduction of Vector Search**: Memorystore for Redis has added support for ultra-low latency vector searches, which can execute at single-digit millisecond latency even when processing millions or billions of vectors. This capability is pivotal for applications requiring rapid data retrieval.

– **Scalability Features**:
– The platform now supports zero-downtime scalability and can be expanded to 250 shards, allowing for the storage of billions of vectors in one instance.
– A single instance of Memorystore for Redis Cluster can perform searches over a billion vectors while maintaining high recall rates (greater than 99%).

– **Performance Scaling**:
– The architecture employs local index partitioning, so each node indexes only the portion of the keyspace it stores locally. Adding nodes yields linear improvements in index build times and, for a fixed number of vectors, logarithmic improvements in HNSW search performance (linear improvements for brute-force search).

– **Hybrid Queries**:
– A new feature has been introduced that allows users to combine vector searches with filters based on numeric and tag fields, enhancing the capability of applications to handle complex queries.
– For example, a clothing retailer can filter for specific items while using vector search for recommendations, ensuring more relevant results are surfaced.

– **Open Source Community Engagement**:
– The text mentions collaboration with the open-source community around the Valkey key-value datastore, showing a commitment to enhancing its capabilities with vector search features.
– This collaboration includes coauthoring an RFC to invite feedback and drive the development of vector search functionalities.

– **Implications for Generative AI**:
– With these enhancements, Memorystore positions itself as a strong contender for generative AI applications that demand fast and scalable vector searches, allowing developers to build more efficient and performance-oriented solutions.

Overall, these advancements signify a robust integration of AI capabilities into cloud services, making it crucial for professionals in the fields of AI, cloud computing, and security to consider how these features impact application development, performance enhancement, and user experience.