
Leverage enterprise data with Denodo and Vertex AI for generative AI applications

Leveraging enterprise data for generative AI and large language models (LLMs) presents significant challenges related to data silos, quality inconsistencies, privacy and security concerns, compliance with data regulations, capturing domain-specific knowledge, and mitigating inherent biases. Organizations must navigate the complexities of consolidating fragmented data sources, ensuring data integrity, and addressing ethical considerations.

Techniques like retrieval augmented generation (RAG) can help bridge the gap between enterprise generative AI apps and the actual enterprise data. Although RAG is a great tool, and LLMs have enabled natural language-to-SQL translation, these capabilities fall short when enterprise data is scattered across a complex, heterogeneous data landscape. It’s straightforward to extend your chatbot to query one database, but how would one deal with a complex system spanning an EDW, a data lake, and several on-premises and SaaS applications? And how would one ensure consistent security across that ecosystem while bringing forward governance, lineage, documentation, and data quality?
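To ground the single-database case this paragraph contrasts against, here is a minimal natural language-to-SQL sketch using Gemini through the langchain-google-vertexai package. It assumes Vertex AI credentials are configured; the loans table schema and column names are hypothetical, for illustration only.

```python
# Minimal natural language-to-SQL sketch against a single database.
# Assumes the langchain-google-vertexai package and Vertex AI credentials.
# The loans table schema is hypothetical.
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_vertexai import ChatVertexAI

llm = ChatVertexAI(model_name="gemini-1.5-pro")

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "Translate the user's question into a single ANSI SQL query. "
     "Schema: loans(loan_id, borrower_name, amount, status, created_at). "
     "Return only the SQL, with no explanation."),
    ("human", "{question}"),
])

# Pipe the prompt into the model; invoke with a natural language question.
chain = prompt | llm
sql = chain.invoke({"question": "How many loans over $500k are still pending?"})
print(sql.content)
```

This works well for one well-documented schema; the rest of the post addresses what happens when there are dozens of them behind different systems.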

A combination of Denodo’s data virtualization and Google’s Vertex AI technologies can address these challenges and opportunities. Denodo enables the creation of a unified, virtual view of data from disparate sources, providing a single access point, while Google’s Vertex AI embeddings, foundation models, and vector search capabilities, combined with LangChain, help build generative AI applications that can intelligently retrieve, synthesize, and process relevant information from the virtualized data layer.
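As a rough sketch of how those pieces could be wired together, the example below embeds content drawn from the virtualized layer and indexes it in Vertex AI Vector Search through LangChain. This is a minimal sketch under stated assumptions, not a reference implementation: the project, region, bucket, index, and endpoint identifiers are placeholders, and the sample texts stand in for data served by Denodo views.

```python
# Hedged sketch: embed documents drawn from the virtual layer and index
# them in Vertex AI Vector Search so a RAG application can retrieve them.
# Project, region, bucket, index, and endpoint IDs are placeholders.
from langchain_google_vertexai import VertexAIEmbeddings
from langchain_google_vertexai import VectorSearchVectorStore

embeddings = VertexAIEmbeddings(model_name="text-embedding-004")

store = VectorSearchVectorStore.from_components(
    project_id="my-project",              # placeholder
    region="us-central1",                 # placeholder
    gcs_bucket_name="my-staging-bucket",  # placeholder
    index_id="my-index-id",               # placeholder
    endpoint_id="my-endpoint-id",         # placeholder
    embedding=embeddings,
)

# In practice these texts would come from views exposed by Denodo.
store.add_texts([
    "FHA loans allow a 3.5% down payment for credit scores of 580 or above.",
    "Jumbo loans exceed the conforming loan limit set by the FHFA.",
])

retriever = store.as_retriever(search_kwargs={"k": 4})
```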

Fig 1 - Platform Architecture: Denodo with Vertex AI

Fig 2 - AI Agent enabled by RAG on Vertex AI with Vector Search, Gemini Pro, and Denodo

In the example above, a mortgage processing company integrates Denodo’s data virtualization with RAG models on Vertex AI to empower loan officers with generative AI and large language models to efficiently handle complex queries and tasks. The data virtualization layer unifies fragmented data sources like the EDW, CRM systems, loan origination software, and compliance documents, ensuring the RAG model has access to comprehensive, up-to-date information. When a loan officer submits a natural language query, the retrieval component fetches relevant data from virtualized sources, such as eligibility criteria or regulatory guidelines, which the language model then processes to generate detailed, contextual responses tailored to the customer’s needs. This approach streamlines intricate processes like pre-qualifying customers, preparing loan packages, and addressing underwriter requests, enabling loan officers to provide accurate, compliant, and personalized service while leveraging the power of generative AI capabilities.
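A hedged, end-to-end sketch of that loan-officer flow might look like the following, reusing the retriever from the earlier sketch together with Gemini on Vertex AI. The prompt wording, sample question, and model choice are illustrative assumptions, not the production setup.

```python
# Hedged RAG sketch for the loan-officer scenario: retrieve context from
# the vector store built above, then let Gemini compose a grounded answer.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_google_vertexai import ChatVertexAI

llm = ChatVertexAI(model_name="gemini-1.5-pro")

prompt = ChatPromptTemplate.from_template(
    "You are assisting a mortgage loan officer.\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer using only the context above."
)

def format_docs(docs):
    # Join retrieved document chunks into one context string.
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(rag_chain.invoke("Is a borrower with a 590 FICO score eligible for an FHA loan?"))
```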

The Denodo Platform leverages data virtualization technology, eliminating the need for data movement or consolidation before augmenting an AI application. It provides a single, consolidated gateway for AI applications to access integrated data and offers a number of other key benefits (a short query sketch follows the list), including:

  • A unified, secure access point for LLMs to interact with and query all enterprise data (ERP, Operational Data Mart, EDW, Application APIs).

  • A rich semantic layer, providing LLMs with the needed business context and knowledge (such as table descriptions, business definitions, categories/tags, and sample values).

  • Quick delivery of logical data views that are decoupled and abstracted from the underlying technical data views (which can be difficult for LLMs to use).

  • Delivery of LLM-friendly wide logical table views without first needing to physically replicate and combine multiple datasets.

  • Built-in query optimization that relieves LLMs from dealing with specific data source constraints or optimized join strategies.

  • Data lineage and other governance tools that can surface additional elements like data provenance, data quality markers, endorsements, and warnings alongside the natural language response to a query.
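To make the single access point concrete, here is a hedged sketch that queries a wide logical view through Denodo’s SQL interface with SQLAlchemy. It assumes the denodo-sqlalchemy dialect is installed (pip install denodo-sqlalchemy); the host, port, credentials, virtual database, and view names are all placeholders.

```python
# Hedged sketch of querying a Denodo wide logical view over SQL, assuming
# the denodo-sqlalchemy dialect is installed. Host, port, credentials,
# virtual database, and view names are placeholders.
from sqlalchemy import create_engine, text

engine = create_engine(
    "denodo://ai_app_user:secret@denodo-host:9996/mortgage_vdb"  # placeholder URI
)

with engine.connect() as conn:
    rows = conn.execute(text(
        "SELECT borrower_name, loan_amount, status "
        "FROM wide_loan_view "  # placeholder wide logical view
        "WHERE status = 'PENDING' LIMIT 10"
    ))
    for row in rows:
        print(row)
```

Because Denodo handles federation and query pushdown, the query is written against a single logical view regardless of how many physical systems sit behind it.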

As enterprises join the generative AI revolution, data management and virtualization will play a pivotal role in unlocking the full potential of this transformative technology. By harnessing the combined power of Google’s Vertex AI and foundation models like Gemini, organizations can integrate their fragmented data sources, ensuring that generative AI models have access to a unified, comprehensive, and up-to-date view of enterprise data. This approach not only enables more accurate and contextual outputs from language models but also facilitates compliance with data governance and privacy regulations. With data virtualization acting as the backbone, enterprises can confidently leverage the capabilities of Vertex AI to develop innovative applications, streamline complex processes, and deliver personalized experiences to customers and employees alike, paving the way for a future where generative AI is deeply embedded in everyday enterprise operations and interactions with customers.

Learn more about Google Cloud’s open and innovative generative AI partner ecosystem. To get started with Denodo on Google Cloud, check out these listings on Marketplace.
