Source URL: https://cloud.google.com/blog/products/databases/llamaindex-integrates-with-alloydb-and-cloud-sql-for-postgresql/
Source: Cloud Blog
Title: Build agentic RAG on Google Cloud databases with LlamaIndex
Feedly Summary: AI agents are revolutionizing the landscape of gen AI application development. Retrieval augmented generation (RAG) has significantly enhanced the capabilities of large language models (LLMs), enabling them to access and leverage external data sources such as databases. This empowers LLMs to generate more informed and contextually relevant responses. Agentic RAG represents a significant leap forward, combining the power of information retrieval with advanced action planning capabilities. AI agents can execute complex tasks that involve multiple steps that reason, plan and make decisions, and then take actions to execute goals over multiple iterations. This opens up new possibilities for automating intricate workflows and processes, leading to increased efficiency and productivity.
LlamaIndex has emerged as a leading framework for building knowledge-driven and agentic systems. It offers a comprehensive suite of tools and functionality that facilitate the development of sophisticated AI agents. Notably, LlamaIndex provides both pre-built agent architectures that can be readily deployed for common use cases, as well as customizable workflows, which enable developers to tailor the behavior of AI agents to their specific requirements.
Today, we’re excited to announce a collaboration with LlamaIndex on open-source integrations for Google Cloud databases including AlloyDB for PostgreSQL and Cloud SQL for PostgreSQL.
These LlamaIndex integrations, available to download via PyPI as llama-index-alloydb-pg and llama-index-cloud-sql-pg, empower developers to build agentic applications that connect with Google Cloud databases. The integrations include:
| Integration | Description | Documentation on GitHub |
| ----------- | ----------- | ----------------------- |
| LlamaIndex Vector Store | Stores vector embeddings of the content and retrieves vectors semantically similar to a query | AlloyDB, Cloud SQL for PostgreSQL |
| LlamaIndex Document Store | Stores the content related to the vectors in the vector store | AlloyDB, Cloud SQL for PostgreSQL |
| LlamaIndex Index Store | Stores metadata about the content in your document store | AlloyDB, Cloud SQL for PostgreSQL |
Developers can also access previously published LlamaIndex integrations for Firestore, including Vector Store and Index Store.
Integration benefits
LlamaIndex supports a broad spectrum of industry use cases, including agentic RAG, report generation, customer support, SQL agents, and productivity assistants. LlamaIndex’s multi-modal functionality extends to applications like retrieval-augmented image captioning, showcasing its versatility in integrating diverse data types. Through these use cases, joint customers of LlamaIndex and Google Cloud databases can expect an enhanced developer experience, complete with:
Streamlined knowledge retrieval: Using these packages makes it easier for developers to build knowledge-retrieval applications with Google databases. Developers can leverage AlloyDB and Cloud SQL vector stores to store and semantically search unstructured data to provide models with richer context. The LlamaIndex vector store integrations let you filter metadata effectively, select from vector similarity strategies, and help improve performance with custom vector indexes.
Complex document parsing: LlamaIndex’s first-class document parser, LlamaParse, converts complex document formats with images, charts and rich tables into a form more easily understood by LLMs; this produces demonstrably better results for LLMs attempting to understand the content of these documents.
Secure authentication and authorization: LlamaIndex integrations with Google Cloud databases apply the principle of least privilege, a security best practice, when creating database connection pools, authenticating, and authorizing access to database instances.
Fast prototyping: Developers can quickly build and set up agentic systems with readily available pre-built agent and tool architectures on LlamaHub.
Flow control: For production use cases, LlamaIndex Workflows provide the flexibility to build and deploy complex agentic systems with granular control of conditional execution, as well as powerful state management.
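The retrieval benefit above rests on vector similarity search: the vector store ranks stored embeddings by their similarity to an embedded query. As a toy illustration of that idea (this is not the integration's API; the two-dimensional vectors and document names below are made up), cosine similarity picks the semantically closest document:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "vector store": document id -> embedding.
store = {"doc_a": [1.0, 0.0], "doc_b": [0.6, 0.8]}

# Toy embedded query; in practice an embedding model produces this.
query = [1.0, 0.1]

# Retrieve the most similar document.
best = max(store, key=lambda doc_id: cosine_similarity(store[doc_id], query))
```

In a real deployment, the embeddings come from a model such as the Vertex AI text-embedding models used later in this post, and the similarity search runs inside PostgreSQL rather than in application code.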
A report generation use case
Agentic RAG workflows are moving beyond simple question-and-answer chatbots. Agents can synthesize information from across sources and knowledge bases to generate in-depth reports. Report generation spans many industries, from legal, where agents can do prework such as research, to financial services, where agents can analyze earnings call reports. Agents mimic experts who sift through information to generate insights. And even if agent reasoning and retrieval take several minutes, automating these reports can save teams several hours.
LlamaIndex provides all the key components to generate reports:
Structured output definitions with the ability to organize outputs into Report templates
Intelligent document parsing to easily extract and chunk text and other media
Knowledge base storage and integration across the customer’s ecosystem
Agentic workflows to define tasks and guide agent reasoning
Now let’s see how these concepts work, and consider how to build a report generation agent that provides daily updates on new research papers about LLMs and RAG.
1. Prepare data: Load and parse documents
The key to any RAG workflow is a well-curated knowledge base. Before you store the data, ensure it is clean and useful. Data for the knowledge base can come from your enterprise data or other sources. To generate reports on top research articles, developers can use the arxiv Python SDK to pull free, open-access publications.
```python
import arxiv

client = arxiv.Client()
search = arxiv.Search(
    query="RAG",
    max_results=5,
    sort_by=arxiv.SortCriterion.SubmittedDate,
)
```
Rather than using the ArxivReader to load and convert articles to plain text, you can parse them with LlamaParse, which supports varying paper formats, tables, and multimodal media, leading to more accurate document parsing.
```python
from llama_parse import LlamaParse

parser = LlamaParse(
    api_key="llx-…",
    result_type="markdown",
    num_workers=2,
)

document = parser.load_data(pdf_file)
```
To improve the knowledge base’s effectiveness, we recommend adding metadata to documents. Metadata enables advanced filtering and supports additional tooling. Learn more about metadata extraction.
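For instance, here is a minimal sketch of deriving a filterable publication date for each document before ingestion. The helper name and input shape are illustrative, not part of the integration; the only constraint is that the metadata key matches the column the vector store is created with later:

```python
from datetime import date

def build_metadata(result: dict) -> dict:
    """Derive a metadata dict whose keys match the columns created later
    (e.g. a "publication_date" the vector store can filter on)."""
    return {
        "publication_date": result["published"].isoformat(),
        "title": result["title"],
    }

# Toy input standing in for a fetched arXiv result.
meta = build_metadata({"published": date(2024, 7, 16), "title": "LOTUS"})
```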
2. Create a knowledge base: store data for retrieval
Now the data needs to be saved for long-term use. The LlamaIndex Google Cloud database integrations support storage and retrieval of a growing knowledge base.
2.1. Create a secure connection to the AlloyDB or Cloud SQL database
Utilize the AlloyDBEngine class to easily create a shareable connection pool that securely connects to your PostgreSQL instance.
```python
from llama_index_alloydb_pg import AlloyDBEngine

engine = await AlloyDBEngine.afrom_instance(
    project_id=PROJECT_ID,
    region=REGION,
    cluster=CLUSTER,
    instance=INSTANCE,
    database=DATABASE,
)
```
Create only the tables needed for your knowledge base. Using separate tables reduces the level of access permissions your agent needs. You can also specify a dedicated “publication_date” metadata column to filter on later.
```python
from llama_index_alloydb_pg import Column

await engine.ainit_doc_store_table(
    table_name="llama_doc_store",
)

await engine.ainit_index_store_table(
    table_name="llama_index_store",
)

await engine.ainit_vector_store_table(
    table_name="llama_vector_store",
    vector_size=768,
    metadata_columns=[Column("publication_date", "DATE")],
)
```
Optional: Set up a Google Cloud embedding model. The knowledge base utilizes vector embeddings to search for semantically similar text.
```python
import google.auth

from llama_index.core import Settings
from llama_index.embeddings.vertex import VertexTextEmbedding

credentials, project_id = google.auth.default()
Settings.embed_model = VertexTextEmbedding(
    model_name="textembedding-gecko@003",
    project=PROJECT_ID,
    credentials=credentials,
)
```
2.2. Customize the underlying storage with the Document Store, Index Store, and Vector Store. For the vector store, specify the metadata field "publication_date" that you created previously.
```python
from llama_index.core import StorageContext
from llama_index_alloydb_pg import (
    AlloyDBDocumentStore,
    AlloyDBIndexStore,
    AlloyDBVectorStore,
)

vector_store = await AlloyDBVectorStore.create(
    engine=engine,
    table_name="llama_vector_store",
    metadata_columns=["publication_date"],
)

doc_store = await AlloyDBDocumentStore.create(
    engine=engine,
    table_name="llama_doc_store",
)
index_store = await AlloyDBIndexStore.create(
    engine=engine,
    table_name="llama_index_store",
)

storage_context = StorageContext.from_defaults(
    docstore=doc_store,
    index_store=index_store,
    vector_store=vector_store,
)
```
2.3 Add the parsed documents to the knowledge base and build a Vector Store Index.
```python
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, show_progress=True
)
```
You can use other LlamaIndex index types like a Summary Index as additional tools to query and combine data.
```python
from llama_index.core import SummaryIndex

summary_index = SummaryIndex.from_documents(documents, storage_context=storage_context)
```
2.4. Create tools from indexes to be used by the agent.
```python
from llama_index.core.tools import QueryEngineTool

search_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(),
    description="Useful for retrieving specific snippets from research publications.",
)

summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_index.as_query_engine(),
    description="Useful for high-level questions about research publications.",
)
```
3. Prompt: create an outline for the report
Reports may have requirements on sections and formatting. The agent needs instructions for formatting. Here is an example outline of a report format:
```python
outline = """
# DATE Daily report: TOPIC

## Executive Summary

## Top Challenges / Description of problems

## Summary of papers

| Title | Authors | Summary | Links |
| ----- | ------- | ------- | ----- |
| LOTUS: Enabling Semantic Queries with LLMs Over Tables of Unstructured and Structured Data | Liana Patel, Siddharth Jha, Carlos Guestrin, Matei Zaharia | … | https://arxiv.org/abs/2407.11418v1 |
"""
```
4. Define the workflow: outline agentic steps
Next, you define the workflow to guide the agent’s actions. In this example workflow, the agent reasons about which tool to call: the summary tool or the vector search tool. Once the agent decides it doesn’t need additional data, it exits the research loop to generate a report.
LlamaIndex Workflows provides an easy-to-use SDK for building any type of workflow:
```python
from typing import Any

from llama_index.core.llms import FunctionCallingLLM
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.tools import BaseTool, ToolSelection
from llama_index.core.workflow import (
    Context,
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)
from llama_index.llms.vertex import Vertex


class QueryEvent(Event):
    chat_history: list

class SummaryEvent(Event):
    tool_call: ToolSelection

class SearchEvent(Event):
    tool_call: ToolSelection

class ReportGenerationEvent(Event):
    pass


class ReportGenerationAgent(Workflow):
    """Report generation agent."""

    def __init__(
        self,
        search_tool: BaseTool,
        summary_tool: BaseTool,
        outline: str,
        llm: FunctionCallingLLM | None = None,
        **kwargs: Any,
    ) -> None:
        super().__init__(**kwargs)
        self.search_tool = search_tool
        self.summary_tool = summary_tool
        self.llm = llm
        self.outline = outline
        self.memory = ChatMemoryBuffer.from_defaults(llm=llm)

    @step
    async def query(self, ctx: Context, ev: StartEvent) -> QueryEvent:
        ctx.data["contents"] = []
        ctx.data["query"] = ev.query
        self.memory.put(ev.query)
        return QueryEvent(chat_history=self.memory.get())

    @step(pass_context=True)
    async def router(
        self, ctx: Context, ev: QueryEvent
    ) -> SummaryEvent | SearchEvent | ReportGenerationEvent | StopEvent:
        chat_history = ev.chat_history

        response = await self.llm.achat_with_tools(
            [self.search_tool, self.summary_tool],
            chat_history=chat_history,
        )

        if …:  # the agent has enough context to write the report
            return ReportGenerationEvent()

        if …:  # the LLM selected the summary tool
            return SummaryEvent()
        elif …:  # the LLM selected the search tool
            return SearchEvent()
        else:
            return StopEvent(result={"response": "Invalid tool."})

    @step(pass_context=True)
    async def handle_retrieval(
        self, ctx: Context, ev: SummaryEvent | SearchEvent
    ) -> QueryEvent:
        if …:
            return self.summary_tool(ctx.data["query"])

        if …:
            return self.search_tool(ctx.data["query"])

        return QueryEvent(chat_history=self.memory.get())

    def format_report(self, contents):
        """Format report utility helper."""
        …
        return report

    @step(pass_context=True)
    async def generate_report(
        self, ctx: Context, ev: ReportGenerationEvent
    ) -> StopEvent:
        """Generate report."""
        report = self.format_report(ctx.data["contents"])
        return StopEvent(result={"response": report})


agent = ReportGenerationAgent(
    search_tool=search_tool,
    summary_tool=summary_tool,
    llm=Vertex(model="gemini-pro"),
    outline=outline,
)
```
5. Generate reports: run the agent
Now that you’ve set up a knowledge base and defined an agent, you can set up automation to generate a report!
```python
query = "What are the recently published RAG techniques?"
report = await agent.run(query=query)

# Save the report
with open("report.md", "w") as f:
    f.write(report["response"])
```
There you have it! A complete report that summarizes recent research in LLM and RAG techniques. How easy was that?
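Since the goal is a daily report, the run above can be wrapped in a scheduler such as cron or Cloud Scheduler. As a small stdlib sketch of the scheduling idea (the helper name and the 06:00 run hour are illustrative assumptions, not part of the integration), you can compute the next daily run time like this:

```python
from datetime import datetime, timedelta

def next_run(now: datetime, hour: int = 6) -> datetime:
    """Next occurrence of `hour`:00, for a daily report schedule."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has passed; schedule for tomorrow.
        candidate += timedelta(days=1)
    return candidate

# 09:30 is past the 06:00 slot, so the next run is tomorrow at 06:00.
nr = next_run(datetime(2024, 7, 16, 9, 30))
```

A production setup would more likely trigger the agent from a managed scheduler than from an in-process loop, but the calculation is the same.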
Get started today
In short, these LlamaIndex integrations with Google Cloud databases enable application developers to leverage the data in their operational databases to easily build complex agentic RAG workflows. This collaboration supports Google Cloud’s long-term commitment to be an open, integrated, and innovative database platform. With LlamaIndex’s extensive user base, these integrations further expand the possibilities for developers to create cutting-edge, knowledge-driven AI agents.
Ready to get started? Take a look at the following Notebook-based tutorials:
AlloyDB
llama_index_vector_store.ipynb
llama_index_doc_store.ipynb
Cloud SQL for PostgreSQL
llama_index_vector_store.ipynb
llama_index_doc_store.ipynb
Find all information on GitHub at github.com/googleapis/llama-index-cloud-sql-pg-python and github.com/googleapis/llama-index-alloydb-pg-python.