
Build trust and context for AI with lineage, now at column-level granularity

Effective AI systems operate on a foundation of context and continuous trust. When you use Dataplex Universal Catalog, Google Cloud’s unified data governance platform, the metadata that describes your data is no longer static. It becomes the place your AI applications consult to learn where to find data and what to trust.

But when you have complex data pipelines, it’s easy for your data’s journey to become obscured, making it difficult to trace information from its origin to its eventual impact. To solve this, we are extending Dataplex lineage capabilities from object-level to column-level, starting with support for BigQuery. 

“To power our AI strategy, we need absolute trust in our data. Column-level lineage provides that. It’s the foundation for governing our data responsibly and confidently.” – Latheef Syed – AVP, Data & AI Governance Engineering at Verizon

While object-level lineage tracks the top-level connections between entire tables, column-level lineage charts the specific path of a single column as it moves and is transformed. This gives you a dynamic, granular map for governing your data-to-AI ecosystem, so you can ground your agentic AI applications in context. The upgrade to column-level lineage comes at no extra cost.

Answering critical questions about your data

Data professionals often need precise answers about the complex relationships in their BigQuery datasets. Column-level lineage provides a graph of data flows that you can trace to find these answers quickly. Now you can:

  • Confirm that a column used in your AI models originates from an authoritative source

  • Understand how changes to one column affect other columns downstream before you make a modification

  • Trace the root cause of an issue with a column by examining its upstream transformations

  • Verify that sensitive data at the column level is used correctly throughout your organization
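To make the tracing questions above concrete, here is a minimal sketch in Python of a column-level lineage graph held in memory. It does not call any Dataplex API; the edge list, column names, and the helper functions build_index and trace are all hypothetical, chosen only to show how a downstream walk answers the impact question and an upstream walk answers the root-cause question.

```python
from collections import deque

# Hypothetical column-level lineage edges: source column -> target column.
# These names are illustrative only; they are not output from any Dataplex API.
EDGES = [
    ("raw.orders.amount", "staging.orders.amount_usd"),
    ("raw.orders.currency", "staging.orders.amount_usd"),
    ("staging.orders.amount_usd", "marts.revenue.daily_total"),
    ("raw.customers.email", "staging.customers.email_hash"),
]


def build_index(edges):
    """Return (downstream, upstream) adjacency maps for the edge list."""
    downstream, upstream = {}, {}
    for src, dst in edges:
        downstream.setdefault(src, set()).add(dst)
        upstream.setdefault(dst, set()).add(src)
    return downstream, upstream


def trace(start, adjacency):
    """Breadth-first walk returning every column reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


DOWNSTREAM, UPSTREAM = build_index(EDGES)

# Impact analysis: which columns depend on raw.orders.amount?
impacted = trace("raw.orders.amount", DOWNSTREAM)

# Root-cause analysis: which columns feed marts.revenue.daily_total?
origins = trace("marts.revenue.daily_total", UPSTREAM)
```

In this toy graph, a change to raw.orders.amount reaches two downstream columns, while raw.customers.email sits on a separate path and is unaffected, which is the distinction a table-level graph could not show if both columns lived in the same table.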

“Column-level lineage takes the trusted map of our data ecosystem to the next level. It’s the precision tool we need to fully understand the impact of a change, trace a problem to its source, and ensure compliance down to the most granular detail.” – Arvind Rajagopalan – AVP, Data / AI & Product Engineering at Verizon

Explore lineage visually

Dataplex now provides an interactive, visual representation of column-level lineage relationships. You can select a single column in a table to see a graph of all its upstream and downstream connections. As you navigate the graph at the asset level, you can drill down to the column level to verify which specific columns are affected by a process. You can also visualize the direct lineage paths between the columns of two different assets, giving you a focused view of their relationship.

Column-level tracing for AI models

Tables used for AI and ML model training often combine data from different sources that took different paths, so granular visibility into the data’s journey matters. For example, a single feature table for model training may contain many columns. Column-level lineage can verify that one column originates from a trusted, audited financial system while another comes from ephemeral web logs. Table-level lineage would obscure this critical distinction, treating all features with the same level of trust.
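As a rough illustration of that distinction, the sketch below labels each feature column by where its upstream columns come from. Everything in it is an assumption made for the example: the FEATURE_ORIGINS mapping stands in for the result of an upstream lineage trace, TRUSTED_PREFIXES stands in for a team-maintained list of audited sources, and classify is a hypothetical helper, not part of Dataplex.

```python
# Hypothetical mapping from each feature column to the source columns found
# by tracing its lineage upstream. All names are made up for illustration.
FEATURE_ORIGINS = {
    "features.account_balance": {"finance_ledger.accounts.balance"},
    "features.clicks_7d": {"web_logs.events.click_id"},
    "features.spend_per_click": {
        "finance_ledger.payments.amount",
        "web_logs.events.click_id",
    },
}

# Assumption: the team keeps its own list of audited source datasets.
TRUSTED_PREFIXES = ("finance_ledger.",)


def classify(origins, trusted_prefixes=TRUSTED_PREFIXES):
    """Label a feature by whether its origins are trusted, untrusted, or both."""
    if not origins:
        return "unknown"  # no lineage recorded, so nothing can be verified
    flags = {src.startswith(trusted_prefixes) for src in origins}
    if flags == {True}:
        return "trusted"
    if flags == {False}:
        return "untrusted"
    return "mixed"


report = {feature: classify(srcs) for feature, srcs in FEATURE_ORIGINS.items()}
```

Here the three features in one table receive three different labels. A table-level view would have to assign the whole table a single label.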

Powering context-aware AI agents

More companies are developing AI agents to automate tasks and answer complex questions about their data, and these agents require a deep understanding of the business and organizational context to be effective. The granular metadata provided by column-level lineage supplies this context. For example, it can allow the agent to distinguish between similarly named metrics. By tracing each column’s path, along with its frequency of usage and its freshness, lineage also tells the agent how important a column is when a change affects it, and how severe the impact is when troubleshooting. By grounding AI agents in a rich, factual map of your data assets and their relationships, you can build more accurate and reliable agentic workflows.

Get started

You can start using column-level lineage for BigQuery today in Dataplex.
