1 Secure AIML Reference Architecture.max 1000x1000 1
As we continue to see rapid AI adoption across the industry, organizations still often struggle to implement secure solutions because of the new challenges around data privacy and security.
We want customers to be successful as they develop and deploy AI, and that means carefully considering risk mitigation and proactive security measures.
When adopting AI, it is crucial to consider a platform-based approach, rather than focus solely on individual models.
A secure AI platform, like a secure storage facility, requires strong foundational cornerstones. These are infrastructure, data, security and responsible AI (RAI).
Responsible AI is essential for building trust in AI systems and ensuring that they are used in a way that is ethical and beneficial. By prioritizing fairness, explainability, privacy, and accountability, organizations can build AI systems that are both secure and trustworthy. To support our customers on their AI journey, we’ve provided the following design considerations.
Vertex AI provides a secured managed environment for building and deploying machine-learning models as well as accessing foundation models. However, building on Vertex requires thoughtful security design.
Secure AI/ML deployment reference architecture.
In the architecture shown above, you can see that we recommend designing AI application security with the four key cornerstones in mind.
You can see here that Virtual Private Cloud (VPC) is the foundation of the platform. It isolates AI resources from the public internet, creating a private network for sensitive data, applications and models on Google Cloud.
We recommend using Private Services Connect (PSC) as the private endpoint type for Vertex AI resources (such as Notebooks and Model Endpoints.) These allow Vertex AI resources to be deployed into private VPC subnets to prevent direct internet access. It also allows applications deployed into private subnets within the VPC to securely make inference to the AI models privately.
VPC Service Control perimeters and Firewall Rules are used in addition to Identity and Access Management (IAM) to authorize network communication and block unwanted connections. This enhances the security of cloud resources and data for AI processing. Access levels can be defined based on IP addresses, device policies, user identity, and geographical location to control access to protected services and projects.
If there are requirements for private resources like notebooks to access the internet for updates, we recommend building a Cloud NAT, which can enable instances in private subnets to access the internet (such as for software updates) without exposure to direct inbound connections.
The reference architecture uses a Cloud Load Balancer (LB) as the network entry point for AI applications. The LB distributes traffic securely across multiple instances, ensuring high availability and scalability. Integrated with Cloud Armor, it protects against denial of service (DDoS) and web attacks.
reCAPTCHA Enterprise can prevent fraud and abuse, whether perpetrated by bots or humans. LBs inherent scalability can effectively mitigate DDoS attacks, as we saw when Google Cloud stopped one of the largest DDoS attacks ever seen.
Model Armor is also used to enhance the security and safety of AI applications by screening foundation model prompts and responses for different security and safety risks. It can perform functions such as filters for dangerous harassment, and hate speech. Additionally, Model Armor can identify malicious URLs in prompts and responses as well as injection and jailbreak attacks.
In this design, Chrome Enterprise Premium is used to implement a Zero Trust model, removing implicit trust by authenticating and authorizing every user and device for remote access to AI applications on Google Cloud. Chrome Enterprise Premium enforces inspection and verification of all incoming traffic, while a Secure Web Proxy manages secure egress HTTP/S traffic.
Sensitive Data Protection can secure AI data on Google Cloud by discovering, classifying, and protecting sensitive information and maintaining data integrity. Cloud Key Management is used to provide centralized encryption key management for Vertex AI model artifacts and sensitive data.
In addition to this reference architecture, it is important to remember to implement appropriate IAM roles when using Vertex AI. This enforces Vertex AI resource control and access for different states of the machine learning (ML) workflow. For example roles must be defined for data scientists, model trainers, model deployers etc.
Finally it is important to conduct regular security assessments and penetration testing to identify and address potential vulnerabilities in your Vertex AI deployments. Tools including Security Command Center, Google Security Operations, Dataplex and Cloud Logging can be used to enforce a secure security posture for AI/ML deployments on Google Cloud.
Building upon the general AI/ML security architecture we’ve discussed, Vertex-based ML workflows present specific security challenges at each state. Address these unique concerns when securing AI workloads following these recommendations:
Development and data ingestion: Begin with secure development by managing access with IAM roles, isolate environments in Vertex AI Notebooks, and secure data ingestion by authenticating pipelines and sanitizing inputs to prevent injection attacks.
Code and pipeline security: Use IAM to secure code repositories with Cloud Source Repositories for access control and implement branch protection policies. Secure CI/CD pipelines using Cloud Build to control build execution and artifact access with IAM. Use secure image sources, and conduct vulnerability scanning on container images.
Training and model protection: Protect model training environments by using private endpoints, controlling access to training data and monitoring for suspicious activity. Manage pipeline components with Container Registry, prioritize private registry access control and container image vulnerability scanning.
Deployment and serving: Secure model endpoints with strong authentication and authorization. Implement rate limiting to prevent abuse. Use Vertex Prediction for prediction serving, and implement IAM policies for model access control and input sanitization to prevent prompt injections.
Monitoring and governance: Continuous monitoring is key. Use Vertex Model Monitoring to set up alerts, detect anomalies, and implement data privacy safeguards.
Secure MLOPs reference architecture.
By focusing on these key areas within the Vertex AI MLOps workflow — from secure development and code management to robust model protection and ongoing monitoring — clients can significantly enhance the security of AI applications.
For highly sensitive customer data on Vertex AI, we recommend using Confidential Computing. It encrypts VM memory, generates ephemeral hardware-based keys unique to each VM that are unextractable, even by Google, and encrypts data in transit between CPUs/GPUs. This Trusted Execution Environment restricts data access to authorized workloads only.
It ensures data confidentiality, enforces code integrity with attestation, and it removes the operator and the workload owner from the trust boundary.
We encourage organizations to prioritize AI security on Google Cloud by applying the following key actions:
Ensure Google Cloud best practices have been adopted for data governance, security, infrastructure and RAI.
Implement security controls such as VPC Service Controls, encryption with CMEK and access control with IAM.
Use the RAI Toolkit to ensure a responsible approach to AI.
Use Google Safeguards to protect AI models.
Ensure the importance of data privacy and secure practices for data management.
Apply security best practices across the MLOps workflow.
Stay informed: Stay up to date with the latest resources, including the Secure AI Framework (SAIF).
By implementing these strategies, organizations can be empowered to use the benefits of AI while effectively mitigating risks, ensuring a secure and trustworthy AI ecosystem on Google Cloud. Reach out to Google’s accredited Partners to help you implement these practices for your business.
This post is divided into five parts: • Understanding the RAG architecture • Building the…
Archival data in research institutions and national laboratories represents a vast repository of historical knowledge,…
Imagine this common scenario: you have a detailed product requirements document for your next project.…
Google expanded Gemini's features, adding the popular podcast-style feature Audio Overviews to the platform.Read More
Wildfire season is coming. Here are the best disposable face coverings we’ve tested—and where you…
Inspired by the movements of a tiny parasitic worm, engineers have created a 5-inch soft…