Cloud Blog: Test it out: an online shopping demo experience with Gemini and RAG

Source URL: https://cloud.google.com/blog/products/ai-machine-learning/an-online-shopping-demo-with-gemini-and-rag/
Source: Cloud Blog
Title: Test it out: an online shopping demo experience with Gemini and RAG

Feedly Summary: Earlier this year, tens of thousands of developers gathered in Las Vegas for Google Cloud Next ’24, which culminated in hundreds of sessions and over 200 announcements. During the Developer Keynote, we showcased how Gemini can help with the shopping experience of an online store. Let’s dive into this demo and how it all worked in more detail!
Problem statement
Imagine you are the owner of Cymbal Shops, an online store with thousands of products. Each item has some metadata: a title, a description, a cost, and more. Traditional built-in search engines can easily handle queries relating to this metadata, but what if shoppers are more interested in abstract questions such as "what piece of clothing pairs well with my outfit?" or "what kind of furniture would complement my living room?"

This kind of user need sounds like a great use case for integrating a large language model (LLM) that can process both text and images, such as Gemini!
But "wait a minute…", I hear you say. An LLM is trained on vast datasets that do not necessarily include the collection of products that Cymbal Shops sells, so you quickly notice that the results are fairly generic and irrelevant:

This is where Retrieval-Augmented Generation (also known as RAG) comes in.
Retrieval-Augmented Generation (RAG)
RAG improves the accuracy and relevance of LLM outputs by augmenting (adding onto) the input prompt with relevant pieces of data stored in an external database or search index. For example, this can involve embedding the prompt and documents into a shared vector space, then using similarity metrics to rank the pieces of data by relevance.
When a new user prompt is received (in our case, when the user talks to the shopping assistant), we embed it in the same format as our online store’s inventory (a series of numerical values). This lets us retrieve the items most likely to be relevant, which we then include as part of the original prompt to the LLM.
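As a rough illustration of that retrieval step, here is a minimal sketch with toy, hand-written embedding vectors and a plain cosine-similarity ranking (in the real demo the vectors would come from an embedding model, and retrieval would run in the database rather than in application code):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy product embeddings; in practice each vector is produced by running
# an embedding model over the product's metadata (title, description, ...).
inventory = {
    "mid-century walnut coffee table": [0.9, 0.1, 0.3],
    "industrial metal bookshelf":      [0.2, 0.8, 0.1],
    "linen throw pillow":              [0.7, 0.3, 0.6],
}

def retrieve(query_embedding, k=2):
    """Rank inventory items by similarity to the query embedding, keep top k."""
    ranked = sorted(
        inventory,
        key=lambda name: cosine_similarity(query_embedding, inventory[name]),
        reverse=True,
    )
    return ranked[:k]

# The user's prompt, embedded into the same vector space as the inventory.
print(retrieve([0.85, 0.15, 0.4]))
```

The items returned by `retrieve` are what gets spliced into the augmented prompt below.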
In this scenario, Cymbal Shops was already using a PostgreSQL database to store product information, so we will leverage AlloyDB, Google Cloud’s 100% PostgreSQL-compatible database, to also store vectorized products. Leveraging Vertex AI’s built-in RAG capabilities may be preferable in other scenarios.
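Since AlloyDB is PostgreSQL-compatible, the vector lookup can be expressed as an ordinary SQL query using the pgvector extension, whose `<=>` operator computes cosine distance. A sketch of how such a query could be built (the `products` table and `embedding` column names are hypothetical, not taken from the demo):

```python
def similarity_query(table="products", embedding_column="embedding", k=3):
    """Build a pgvector nearest-neighbor query for a PostgreSQL-compatible
    database such as AlloyDB. `<=>` is pgvector's cosine-distance operator;
    the user prompt's embedding is passed in as a bound parameter."""
    return (
        f"SELECT id, title, description "
        f"FROM {table} "
        f"ORDER BY {embedding_column} <=> %s::vector "
        f"LIMIT {k};"
    )

print(similarity_query())
```

Running this query with the embedded user prompt as the parameter returns the closest-matching products, which are then injected into the LLM prompt.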

Without RAG, the prompt sent to Gemini might have looked like this:

code_block
[…].

Here is the user’s request: <user request>.

With RAG, the prompt generated now looks like this:

code_block
You are an interior designer that works for Cymbal Shops. Find the most relevant items that matches the style of the room described here: <room description>.

Here is the user’s request: <user request>.

Only pick items that exists as part of this list, and return your top 3 picks: <item list>.
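Putting it together, the augmented prompt can be assembled from the retrieved items just before the call to Gemini. A minimal sketch with a hypothetical `build_prompt` helper (the demo's actual code lives in the open-source sample):

```python
def build_prompt(room_description, user_request, retrieved_items):
    """Assemble a RAG prompt like the one above from retrieved inventory items."""
    item_list = ", ".join(retrieved_items)
    return (
        "You are an interior designer that works for Cymbal Shops. "
        "Find the most relevant items that match the style of the room "
        f"described here: {room_description}.\n\n"
        f"Here is the user's request: {user_request}.\n\n"
        "Only pick items that exist as part of this list, and return your "
        f"top 3 picks: {item_list}."
    )

prompt = build_prompt(
    "a bright living room with wooden floors",
    "Suggest furniture pieces that pair well with this room",
    ["walnut coffee table", "linen sofa", "rattan floor lamp", "oak side table"],
)
print(prompt)
```

Because the item list is restricted to retrieved inventory, the model can only recommend products the store actually sells.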

Notice the difference?
Let’s deploy this new version of Cymbal Shops and try the same user prompt ("Suggest furniture pieces that pairs well with this room") alongside the same living room space photo:

The shopping assistant now only recommends real products from the Cymbal Shops inventory!
What’s next?
Try out the demo and see for yourself how Gemini can improve your customer experience! The Cymbal Shops sample application is open source, and its repository includes instructions for deploying this demo.
Learn more about how you can leverage the power of RAG in Google Cloud:

What is Retrieval-Augmented Generation (RAG)?

Infrastructure for a RAG-capable generative AI application using Vertex AI

Build enterprise gen AI apps with Google Cloud databases

AI Summary and Description: Yes

Summary: The text presents a demo from Google Cloud Next ’24, showcasing how large language models (LLMs), specifically Gemini, utilize Retrieval-Augmented Generation (RAG) to improve the online shopping experience through enhanced product recommendation systems. The integration addresses the challenges of generic LLM outputs by refining search outputs with relevant product data.

Detailed Description: The content elaborates on innovative AI applications in retail, focusing on how LLMs can enhance user interactions in e-commerce settings. This has significant implications for professionals interested in AI and cloud computing security, particularly concerning data management and the incorporation of generative models into business infrastructures.

– **Event Context**:
– Overview of Google Cloud Next ’24 with numerous developer sessions and announcements.

– **Use Case**:
– **Cymbal Shops**: This online retail scenario exemplifies the application of AI to improve customer engagement and satisfaction.
– Traditional search engines struggle with abstract user queries, hence the need for sophisticated AI models.

– **Challenges with LLMs**:
– LLMs like Gemini can produce generic responses due to being trained on diverse datasets not specifically tailored to Cymbal Shops’ inventory.

– **Introduction of RAG**:
– **Retrieval-Augmented Generation (RAG)**: A technique that enhances the contextuality and accuracy of LLM outputs. It improves relevance by integrating external data (such as product inventory) into the LLM’s prompt processing.
– Discusses embedding prompts and documents into a shared vector space to rank data by relevance.

– **Practical Implementation**:
– Utilizes AlloyDB—a PostgreSQL-compatible database—to manage product information alongside vectorization methods for the inventory.
– Demonstrates improved querying by changing how prompts are constructed with specific item lists.

– **Future Applications**:
– Encouragement to try the hands-on demo to experience RAG’s impact on customer experience firsthand.
– Various resources are provided for developers to explore RAG implementation on Google Cloud.

The significance of this narrative lies in its practical application of LLMs for real-world scenarios, demonstrating how enhanced data handling can lead to improved customer service while ensuring accuracy and performance in AI-driven environments. This integration is crucial for security professionals as it raises considerations about data privacy, security of customer information, and regulatory compliance in handling AI-generated outputs in business contexts.