From the Ukraine war and COVID-19 pandemic to preventing wildfires in California and facilitating cancer research — Palantir’s impact in supporting mission critical problems is widely known. What may be less apparent is the effort it takes for our products to even be in the room when it comes to enabling solutions for these real-world challenges.
Most institutions we work with require us to meet and exceed specific requirements before they are willing to trust our platforms with their most sensitive data — this is something we have invested 20 years of engineering into and continuously adapt based on learnings in the field.
We have seen organizations approach data protection assessments of our product at a variety of levels: some work towards the lower bound of data protection and privacy regulations and laws, while others have built their brand on security and privacy.
In this post, we will cover how we use privacy first principles to approach data protection broadly and in Palantir Foundry throughout the data lifecycle — from the moment it lands in the platform, right through to when that data should be deleted.
While there are many privacy frameworks, we’ll use the Fair Information Practice Principles (FIPPs), one of the most widely accepted. These were first proposed in the 1973 report “Records, Computers and the Rights of Citizens” by the Secretary’s Advisory Committee on Automated Personal Data Systems, U.S. Department of Health, Education, and Welfare. They have a long history of evolution and adaptation since, and push for core principles including Transparency (T), Individual Participation (IP), Purpose Specification (PS), Data Minimization (DM), Use Limitation (UL), Data Quality and Integrity (DQ&I), Security (S), and Accountability and Auditing (A&A). These abbreviations appear in the bracketed tags throughout this post.
What may be surprising is that these privacy concepts existed well before mobile phones, the internet, GPS, and social media, the root of much of the privacy community’s consternation in the last few decades. Yet in each new iteration of data protection regulations, whether HIPAA, GDPR, LGPD, CCPA/CPRA, or others, the FIPPs lend a consistent set of principles across this ever-changing assemblage of acronyms.
We are often asked how our products comply with different data protection regulations. More specifically:
Our approach has enabled us to quickly ensure our clients not only meet the base privacy requirements relevant to them, but also achieve more ambitious privacy objectives.
Data lifecycles normally start with collection and end with deletion. Since Palantir’s products do not collect data and instead support our customers in the processing of their data, we’ll approach this topic starting from the data ingestion phase. Below we outline some distinct phases in the data lifecycle that prompt data protection needs:
Organizations often have steps prior to the data actually landing onto the platform.
Cataloging Data Governance Requirements [S, A&A, UL, DM]
Many organizations require identifying data protection and governance requirements prior to the data being used and do so with data catalogs. Capturing this information within the platform allows the data governance instructions to sit alongside the data, streamlining the process for getting context and using the data.
Once data has been approved, it’s time to bring the data onto the platform.
Setting Access Controls [S, UL]
Organizations typically protect sensitive data by restricting user access, sometimes creating a sandbox for review before sharing data more broadly. Using granular access controls, markings, and restricted views, administrators control which users and groups can access what data and with what roles.
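As a rough illustration of how marking-based access control can work, the sketch below grants access only when a user holds every marking attached to a dataset. The `User`, `Dataset`, and `can_access` names are hypothetical and are not Foundry APIs; this is a simplified model, not a description of the platform's implementation.

```python
from dataclasses import dataclass

# Hypothetical model for illustration only; none of these names are Foundry APIs.
@dataclass(frozen=True)
class User:
    name: str
    markings: frozenset = frozenset()

@dataclass(frozen=True)
class Dataset:
    name: str
    required_markings: frozenset = frozenset()

def can_access(user: User, dataset: Dataset) -> bool:
    """A user must hold every marking on the dataset (a conjunctive check)."""
    return dataset.required_markings <= user.markings

analyst = User("analyst", frozenset({"PII"}))
intern = User("intern")
patients = Dataset("patients", frozenset({"PII"}))
public_stats = Dataset("public_stats")

print(can_access(analyst, patients))     # analyst holds the PII marking
print(can_access(intern, patients))      # intern does not
print(can_access(intern, public_stats))  # unmarked data is open to both
```

Treating markings as a subset check means adding a marking to a dataset can only narrow its audience, which is the behavior one usually wants for sensitive data.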
Tagging Data [S, UL, DM]
As described in Metadata Management for Data Protection, whether capturing PII, country-specific tags, or data protection metadata, Foundry enables users to apply tags upfront that propagate everywhere the data goes across the platform. This gives users visibility into the characteristics and metadata of sensitive data.
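Tag propagation of this kind can be pictured as a walk over a lineage graph in which each derived dataset inherits the tags of everything upstream of it. The sketch below assumes a simple parent-to-children mapping; the function name, dataset names, and tag values are hypothetical and do not reflect how Foundry stores or propagates metadata.

```python
from collections import deque

def propagate_tags(edges, source_tags):
    """Push tags from tagged datasets to every downstream dataset.

    edges: dict mapping a dataset name to the datasets derived from it.
    source_tags: dict mapping a dataset name to the tags applied to it directly.
    Returns a dict mapping every reached dataset to its full tag set.
    """
    tags = {name: set(t) for name, t in source_tags.items()}
    queue = deque(source_tags)
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            before = len(tags.setdefault(child, set()))
            tags[child] |= tags[node]
            # Revisit the child's descendants only if it gained a new tag.
            if len(tags[child]) > before:
                queue.append(child)
    return tags

# Hypothetical lineage: two raw datasets feed a shared report.
edges = {
    "raw_patients": ["clean_patients"],
    "clean_patients": ["ward_report"],
    "raw_wards": ["ward_report"],
}
source_tags = {"raw_patients": {"PII"}, "raw_wards": {"country:DE"}}

tags = propagate_tags(edges, source_tags)
print(sorted(tags["ward_report"]))
```

Because a dataset's tag set only ever grows, the loop terminates even if the graph contains a cycle.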
Next, users can add some monitoring on where that sensitive data flows throughout the platform.
Detecting PII and Sensitive Data [S, UL]
Once data lands in the platform, Foundry Inference can be configured so that administrators set organization-specific definitions of sensitive data and then alert on, triage, and track that data. For instance, when PII is detected, the platform can immediately lock down the dataset or alert reviewers to check that it is only in authorized spaces and accessible by appropriate users.
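To make the idea of organization-specific sensitive data definitions concrete, here is a minimal sketch that scans rows for values matching configurable patterns and reports which columns triggered which definition. The patterns, the `detect_pii` function, and the sample rows are assumptions for illustration; they are deliberately simplistic and are not how Foundry Inference is configured or implemented.

```python
import re

# Hypothetical, deliberately simple definitions; real ones would be far more careful.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def detect_pii(rows):
    """Return {column: sorted pattern names} for columns with at least one match."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            for label, pattern in PII_PATTERNS.items():
                if pattern.search(str(value)):
                    findings.setdefault(column, set()).add(label)
    return {column: sorted(labels) for column, labels in findings.items()}

rows = [
    {"id": 1, "contact": "ana@example.org", "note": "follow-up in May"},
    {"id": 2, "contact": "n/a", "note": "SSN 123-45-6789 on file"},
]

findings = detect_pii(rows)
print(findings)
```

A reviewer workflow could then use the findings to decide whether a dataset needs to be restricted or tagged before wider use.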
After data is ingested, tagged, and properly landed onto the platform, the next step is to ensure the necessary data minimization strategies, data quality checks, and access controls are applied everywhere the data goes.
De-identifying and Aggregating Data [DM, UL]
A common method of minimizing data and enforcing “need-to-know” access, while still allowing users to leverage the utility of data, is to prepare and transform the data to different levels of granularity. Some strategies:
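As a rough illustration of what preparing data at different levels of granularity can look like, the sketch below pseudonymizes an identifier with a keyed hash, generalizes exact ages into bands, and aggregates counts while suppressing small groups. All names, the demo key, and the threshold are assumptions chosen for the example; they are not taken from Foundry, and the choice of techniques is illustrative rather than exhaustive.

```python
import hashlib
import hmac
from collections import Counter

SECRET_KEY = b"demo-only-key"  # assumption: a real key would live in a secrets manager

def pseudonymize(identifier: str) -> str:
    """Replace an identifier with a keyed hash so rows can still be joined."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()[:16]

def age_band(age: int, width: int = 10) -> str:
    """Generalize an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def aggregate(records, min_count=2):
    """Count records per (region, age band), dropping groups below min_count."""
    counts = Counter((r["region"], age_band(r["age"])) for r in records)
    return {group: n for group, n in counts.items() if n >= min_count}

records = [
    {"patient_id": "p-001", "age": 34, "region": "north"},
    {"patient_id": "p-002", "age": 37, "region": "north"},
    {"patient_id": "p-003", "age": 62, "region": "south"},
]

# Row-level view with direct identifiers removed and ages generalized.
deidentified = [
    {"pid": pseudonymize(r["patient_id"]), "age_band": age_band(r["age"]), "region": r["region"]}
    for r in records
]
# Aggregate view: the single-person 'south' group is suppressed.
summary = aggregate(records)
print(summary)
```

Different audiences can then be given the row-level or the aggregate view depending on what they need to know.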
Monitoring Data Quality [DQ&I]
As we describe in our earlier post Trust in Data (Palantir Explained, #4), Foundry allows users to leverage tools to make sure data is accurate, timely, and complete when representing individuals.
Validating Permissions [S]
As artifacts and resources are built, integrated, and joined together, it is important to continually check that permissions are appropriate and adhere to relevant policies. The Data Lineage tool gives all users, including data protection leads and platform administrators, the ability to inspect who has access to what data throughout the platform.
As data is handled by users, it is important to ensure accountable use of that data throughout the platform.
Capturing Purposes of Sensitive Data Use or Actions (e.g., Export and Downloads) [PS, A&A, UL]
For sensitive data, ensuring users are only using data for approved purposes becomes vital. This can be done with Checkpoints and with Purpose-based Access Controls at Palantir (Palantir Explained, #2), where all access to data can be traced to why the user needs it and matched to the authorized purposes. Another common configuration is setting up justifications or acknowledgements prior to data being exported from the platform, where these interactions are logged and audited directly on the platform.
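The pattern of matching a stated purpose against authorized purposes, and requiring a justification before export, can be sketched as follows. The policy table, the `request_export` function, and the log format are hypothetical; they are not Foundry's Checkpoints or Purpose-based Access Controls, only a simplified model of the idea.

```python
from datetime import datetime, timezone

# Hypothetical policy table: dataset name -> purposes approved for it.
APPROVED_PURPOSES = {
    "patient_records": {"care-coordination", "capacity-planning"},
}

audit_trail = []  # every request is recorded, whether granted or not

def request_export(user, dataset, purpose, justification):
    """Grant an export only for an approved purpose with a non-empty justification."""
    granted = (
        purpose in APPROVED_PURPOSES.get(dataset, set())
        and bool(justification.strip())
    )
    audit_trail.append({
        "user": user,
        "dataset": dataset,
        "purpose": purpose,
        "justification": justification,
        "granted": granted,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return granted

ok = request_export("dana", "patient_records", "capacity-planning", "Weekly bed forecast")
denied = request_export("dana", "patient_records", "marketing", "Newsletter list")
print(ok, denied, len(audit_trail))
```

Logging denied requests alongside granted ones keeps the record useful for later review of how data was used and why.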
Auditing User Interactions [A&A]
Foundry captures audit logs, which contain information such as who performed what action, when, and where, to enable monitoring of the appropriate use of sensitive data. Organizations should monitor security audit logs to identify anomalous behavior and to understand any unauthorized access patterns, as described in Building Software for a Zero Trust World.
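One simple form of audit log monitoring is to count denied access attempts per user and flag anyone above a threshold for review. The event fields, the threshold, and the `users_to_review` function below are assumptions for illustration and do not reflect Foundry's audit log schema.

```python
from collections import Counter

# Hypothetical audit events recording who did what, when, where, and the outcome.
events = [
    {"who": "dana", "action": "read", "resource": "patient_records",
     "when": "2024-05-01T09:00:00Z", "outcome": "allowed"},
    {"who": "dana", "action": "export", "resource": "patient_records",
     "when": "2024-05-01T09:05:00Z", "outcome": "denied"},
    {"who": "sam", "action": "read", "resource": "payroll",
     "when": "2024-05-01T22:10:00Z", "outcome": "denied"},
    {"who": "sam", "action": "read", "resource": "payroll",
     "when": "2024-05-01T22:11:00Z", "outcome": "denied"},
    {"who": "sam", "action": "export", "resource": "payroll",
     "when": "2024-05-01T22:12:00Z", "outcome": "denied"},
]

def users_to_review(events, max_denied=2):
    """Return users with more than max_denied denied actions, sorted by name."""
    denied = Counter(e["who"] for e in events if e["outcome"] == "denied")
    return sorted(user for user, n in denied.items() if n > max_denied)

flagged = users_to_review(events)
print(flagged)
```

A fixed threshold is the crudest possible signal; it is shown here only to make the monitoring step concrete.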
Tracking Uses for Data Subject Rights [IP, T]
From models to applications to reports, all artifacts are registered and can be visualized through Foundry’s Data Lineage capability, meaning organizations can trace how data from any data subject is used for varying purposes on the platform. This makes possible workflows such as creating Data Subject Requests, which help data subjects understand how an organization uses their data.
When data finally reaches its end of life, whether through the data no longer being useful or due to governance policies, it is eventually time for deletion.
Deleting Data [DM]
Foundry provides administrators with tools to set retention dates and delete data to adhere to legal hold or regulatory requirements, as well as for Right to be Forgotten requests. Leveraging Palantir’s data lineage tracking, Foundry allows administrators to trace data use across the platform and perform necessary deletions.
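A retention policy of this kind can be reduced to a date comparison with an exception for data under legal hold. The sketch below assumes each dataset carries a creation date, a retention period in days, and a legal hold flag, and that a legal hold blocks deletion; the field names and the `due_for_deletion` function are hypothetical and are not Foundry's retention tooling.

```python
from datetime import date, timedelta

# Hypothetical dataset metadata for illustration.
datasets = [
    {"name": "visits_2019", "created": date(2019, 1, 1),
     "retention_days": 365, "legal_hold": False},
    {"name": "claims_2019", "created": date(2019, 1, 1),
     "retention_days": 365, "legal_hold": True},
    {"name": "visits_2024", "created": date(2024, 1, 1),
     "retention_days": 3650, "legal_hold": False},
]

def due_for_deletion(datasets, today):
    """Return names of datasets past their retention date and not under legal hold."""
    return [
        d["name"]
        for d in datasets
        if not d["legal_hold"]
        and d["created"] + timedelta(days=d["retention_days"]) <= today
    ]

due = due_for_deletion(datasets, date(2024, 6, 1))
print(due)
```

In practice the resulting list would be combined with lineage information so that derived copies of the expired data are removed as well.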
Decommissioning Projects [DM]
Whether for a migration or a decommissioning of entire use cases, Foundry has deletion protocols that delete all resources through each layer of Foundry, starting with access controls, then the application, storage, backup, and cloud infrastructure layers.
It can be overwhelming to think about the best way to ensure compliance and establish necessary policies when handling sensitive data, but Foundry provides the technical frameworks and tools to support best practices within the platform to complement organizational policies. From health data for critical hospital operations to personnel data for the Department of Defense and marketing data for European customers, Palantir has worked with some of the most sensitive and protected data, while further supporting differentiated capabilities requiring best-in-class security and privacy, from secure collaboration for defense to data sharing for health research.
While many organizations aim for compliance, we also know that regulations will often lag behind the pace of technology, so we have invested in building privacy-protective technologies from the first principles of privacy. Over time, we have seen how this has paid off in not only future-proofing our technology and architecture, but also continuing to push the boundaries of what is possible with our clients.
Alice Yu, Privacy & Civil Liberties Commercial and Public Health Lead, Palantir Technologies
Protecting Data with Privacy First Principles was originally published in Palantir Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.