Managing large photo collections presents significant challenges for organizations and individuals. Traditional approaches rely on manual tagging, basic metadata, and folder-based organization, which can become impractical when dealing with thousands of images containing multiple people and complex relationships. Intelligent photo search systems address these challenges by combining computer vision, graph databases, and natural language processing to transform how we discover and organize visual content. These systems capture not just who and what appears in photos, but the complex relationships and contexts that make them meaningful, enabling natural language queries and semantic discovery.
In this post, we show you how to build a comprehensive photo search system using the AWS Cloud Development Kit (AWS CDK) that integrates Amazon Rekognition for face and object detection, Amazon Neptune for relationship mapping, and Amazon Bedrock for AI-powered captioning. We demonstrate how these services work together to create a system that understands natural language queries like “Find all photos of grandparents with their grandchildren at birthday parties” or “Show me pictures of the family car during road trips.”
The key benefit is the ability to personalize and customize search to focus on specific people, objects, or relationships while scaling to handle thousands of photos and complex family or organizational structures. Our approach demonstrates that integrating Amazon Neptune graph database capabilities with Amazon AI services enables natural language photo search that understands context and relationships, moving beyond simple metadata tagging to intelligent photo discovery. We showcase this through a complete serverless implementation that you can deploy and customize for your specific use case.
This section outlines the technical architecture and workflow of our intelligent photo search system. As illustrated in the following diagram, the solution uses serverless AWS services to create a scalable, cost-effective system that automatically processes photos and enables natural language search.
The serverless architecture scales efficiently for multiple use cases:
The architecture combines several AWS services to create a contextually aware photo search system:
The system follows a streamlined workflow:
The complete source code is available on GitHub.
Before implementing this solution, ensure you have the following:
The AWS CDK installed (npm install -g aws-cdk)
Download the complete source code from the GitHub repository. More detailed setup and deployment instructions are available in the README.
The project is organized into several key directories that separate concerns and enable modular development:
The solution uses the following key Lambda functions:
To deploy the solution, complete the following steps:
pip install -r requirements_neptune.txt
cdk bootstrap
cdk deploy
The system automatically handles face recognition, label detection, relationship resolution, and AI caption generation through the serverless pipeline, enabling natural language queries like “person’s mother with car” powered by Neptune graph traversals.
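To make the traversal concrete, the following is a hypothetical sketch of how a parsed query such as "person's mother with car" could be rendered as a Gremlin traversal string. The vertex and edge labels used here (`person`, `mother_of`, `appears_in`, `contains_label`) and the `s3_key` property are illustrative assumptions, not the deployed Neptune schema.

```python
# Illustrative sketch: render a parsed natural-language query as a Gremlin
# traversal string. All labels and property names are assumptions.

def build_photo_query(person: str, relation: str, label: str) -> str:
    """Photos showing a relative of `person` together with an object label."""
    return (
        f"g.V().has('person','name','{person}')"
        f".in('{relation}_of')"                 # hop to the relative (e.g. mother)
        ".out('appears_in')"                    # photos that relative appears in
        f".where(__.out('contains_label').has('label','name','{label}'))"
        ".valueMap('s3_key')"
    )

print(build_photo_query("sarah", "mother", "car"))
```

In the deployed solution, a string like this would be submitted to the Neptune Gremlin endpoint; here it only demonstrates how a relationship hop and a label filter compose into one traversal.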
In this section, we discuss the key features and use cases for this solution.
With Amazon Rekognition, you can automatically identify individuals from reference photos, without manual tagging. Upload a few clear images per person, and the system recognizes them across your entire collection, regardless of lighting or angles. This automation reduces tagging time from weeks to hours, supporting corporate directories, compliance archives, and event management workflows.
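The matching step can be illustrated with a small sketch that selects an identity from a Rekognition SearchFacesByImage-style response. The response shape (`FaceMatches`, `Similarity`, `ExternalImageId`) follows the real API; the threshold value and sample data are illustrative assumptions.

```python
# Sketch: pick the strongest confident identity from a response shaped like
# Rekognition SearchFacesByImage output. Threshold and data are illustrative.

def best_match(face_matches, threshold=90.0):
    """Return the ExternalImageId of the strongest match above `threshold`,
    or None when no match is confident enough."""
    confident = [m for m in face_matches if m["Similarity"] >= threshold]
    if not confident:
        return None
    top = max(confident, key=lambda m: m["Similarity"])
    return top["Face"]["ExternalImageId"]

response = {
    "FaceMatches": [
        {"Similarity": 97.4, "Face": {"ExternalImageId": "sarah"}},
        {"Similarity": 88.1, "Face": {"ExternalImageId": "mike"}},
    ]
}
print(best_match(response["FaceMatches"]))  # sarah
```

Using a similarity threshold rather than always taking the top hit is what lets the pipeline leave uncertain faces untagged instead of mislabeling them.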
By using Neptune, the solution understands who appears in photos and how they are connected. You can run natural language queries such as “Sarah’s manager” or “Mom with her children,” and the system traverses multi-hop relationships to return relevant images. This semantic search replaces manual folder sorting with intuitive, context-aware discovery.
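The multi-hop idea can be modeled in a few lines of plain Python. This is an in-memory stand-in for what Neptune does at scale; the names, edge labels, and photo data are made up for the example.

```python
# Toy in-memory model of a multi-hop lookup such as "Sarah's manager":
# first resolve the relationship, then intersect photo membership.
# All data here is illustrative; the real graph lives in Neptune.

RELATIONSHIPS = {
    ("sarah", "manager"): "david",
}

PHOTOS = {
    "photo1.jpg": {"sarah", "david"},
    "photo2.jpg": {"mom", "sarah", "mike"},
}

def photos_of(*people):
    """Photos in which every named person appears."""
    wanted = set(people)
    return [p for p, faces in PHOTOS.items() if wanted <= faces]

# "Sarah's manager" resolves to david; then find photos with both people.
manager = RELATIONSHIPS[("sarah", "manager")]
print(photos_of("sarah", manager))  # ['photo1.jpg']
```

The two-step shape (resolve the relationship, then filter photos) is exactly what the Gremlin traversals perform, only against a persistent graph instead of Python dictionaries.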
Amazon Rekognition detects objects, scenes, and activities, and Neptune links them to people and relationships. This enables complex queries like “executives with company vehicles” or “teachers in classrooms.” The label hierarchy is generated dynamically and adapts to different domains—such as healthcare or education—without manual configuration.
Using Amazon Bedrock, the system creates meaningful, relationship-aware captions such as “Sarah and her manager discussing quarterly results” instead of generic ones. Captions can be tuned for tone (such as objective for compliance, narrative for marketing, or concise for executive summaries), enhancing both searchability and communication.
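Tone control can be implemented as simple prompt presets. The preset names and wording below are assumptions for illustration; the solution sends a prompt assembled in a similar way to a model through Amazon Bedrock.

```python
# Hedged sketch of tone-controlled caption prompting. Preset names and
# wording are illustrative, not the repository's exact prompts.

TONE_PRESETS = {
    "objective": "Describe the scene factually in one sentence.",
    "narrative": "Write a warm, story-like caption in one or two sentences.",
    "concise": "Summarize the photo in at most eight words.",
}

def build_caption_prompt(people, labels, tone="objective"):
    """Combine a tone instruction with the detected context for the model."""
    instruction = TONE_PRESETS[tone]
    context = f"People: {', '.join(people)}. Detected labels: {', '.join(labels)}."
    return f"{instruction}\n{context}"

print(build_caption_prompt(["Sarah", "her manager"], ["office", "laptop"], "concise"))
```

Because the tone only changes the instruction line, switching between compliance, marketing, and executive styles requires no change to the detection pipeline.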
With the web UI, users can search photos using natural language, view AI-generated captions, and adjust tone dynamically. For example, queries like “mother with children” or “outdoor activities” return relevant, captioned results instantly. This unified experience supports both enterprise workflows and personal collections.
The following screenshot demonstrates using the web UI for intelligent photo search and caption styling.
Neptune scales to model thousands of relationships and label hierarchies across organizations or datasets. Relationships are automatically generated during image processing, enabling fast semantic discovery while maintaining performance and flexibility as data grows.
The following diagram illustrates an example person relationship graph (configuration-driven).
Person relationships are configured through JSON data structures passed to the initialize_relationship_data() function. This configuration-driven approach supports unlimited use cases without code modifications: you simply define your people and relationships in the configuration object.
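A configuration object of this kind might look like the following. The keys and field names are assumptions based on the described behavior, not the repository's exact schema; the consistency check shows how such a structure can be validated before loading it into the graph.

```python
# Hypothetical shape of the configuration passed to
# initialize_relationship_data(). Field names are illustrative.

relationship_config = {
    "people": [
        {"id": "sarah", "name": "Sarah"},
        {"id": "mom", "name": "Mom"},
        {"id": "mike", "name": "Mike"},
    ],
    "relationships": [
        {"from": "mom", "type": "mother_of", "to": "sarah"},
        {"from": "mom", "type": "mother_of", "to": "mike"},
        {"from": "sarah", "type": "sibling_of", "to": "mike"},
    ],
}

# Sanity check: every relationship must reference a known person.
people_ids = {p["id"] for p in relationship_config["people"]}
for rel in relationship_config["relationships"]:
    assert rel["from"] in people_ids and rel["to"] in people_ids
print("configuration is consistent")
```

Swapping this object for an organizational chart (employees and manager_of edges, for example) changes the domain without touching any pipeline code.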
The following diagram illustrates an example label hierarchy graph (automatically generated from Amazon Rekognition).
Label hierarchies and co-occurrence patterns are automatically generated during image processing. Amazon Rekognition provides category classifications that create the belongs_to relationships, and the appears_with and co_occurs_with relationships are built dynamically as images are processed.
The following screenshot illustrates a subset of the complete graph, demonstrating multi-layered relationship types.
The relationship graph uses a flexible configuration-driven approach through the initialize_relationship_data() function. This removes the need for hard-coded relationships and supports unlimited use cases:
The label relationship database is created automatically during image processing through the store_labels_in_neptune() function:
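The bookkeeping that such a function performs can be sketched as follows. This is an illustrative stand-in, not the repository's store_labels_in_neptune() code: it records belongs_to edges from a simplified per-label category field (the real Rekognition DetectLabels response nests categories in a list) and counts which labels co-occur in the same image.

```python
# Sketch of the label bookkeeping described for store_labels_in_neptune():
# belongs_to edges from label categories, plus co-occurrence counts.
# Function and field names are illustrative.
from collections import Counter
from itertools import combinations

def build_label_edges(detections):
    """detections: one label list per image; each label is a dict with
    'Name' and a simplified 'Category' field."""
    belongs_to = set()
    co_occurs = Counter()
    for labels in detections:
        names = sorted({l["Name"] for l in labels})
        for l in labels:
            belongs_to.add((l["Name"], l["Category"]))
        for a, b in combinations(names, 2):   # every label pair in this image
            co_occurs[(a, b)] += 1
    return belongs_to, co_occurs

images = [
    [{"Name": "Car", "Category": "Vehicles"}, {"Name": "Road", "Category": "Outdoors"}],
    [{"Name": "Car", "Category": "Vehicles"}, {"Name": "Person", "Category": "People"}],
]
edges, counts = build_label_edges(images)
print(counts[("Car", "Road")])  # 1
```

In the deployed solution these edges are written to Neptune as they are computed, so the co_occurs_with weights grow incrementally as new photos are processed.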
With these functions, you can manage large photo collections with complex relationship queries, discover photos by semantic context, and find themed collections through label co-occurrence patterns.
Consider the following performance and scalability factors:
This solution implements comprehensive security measures to protect sensitive image and facial recognition data. The system encrypts data at rest using AES-256 encryption with AWS Key Management Service (AWS KMS) managed keys and secures data in transit with TLS 1.2 or later. Neptune and Lambda functions operate within virtual private cloud (VPC) subnets, isolated from direct internet access, and API Gateway provides the only public endpoint with CORS policies and rate limiting.
Access control follows least-privilege principles with AWS Identity and Access Management (IAM) policies that grant only minimum required permissions: Lambda functions can only access specific S3 buckets and DynamoDB tables, and Neptune access is restricted to authorized database operations. Image and facial recognition data stays within your AWS account and is never shared outside AWS services. You can configure Amazon S3 lifecycle policies for automated data retention management, and AWS CloudTrail provides complete audit logs of data access and API calls for compliance monitoring, supporting GDPR and HIPAA requirements with additional Amazon GuardDuty monitoring for threat detection.
To avoid incurring future charges, complete the following steps to delete the resources you created:
aws s3 rm s3://YOUR_BUCKET_NAME --recursive
cdk destroy
aws rekognition delete-collection --collection-id face-collection
This solution demonstrates how Amazon Rekognition, Amazon Neptune, and Amazon Bedrock can work together to enable intelligent photo search that understands both visual content and context. Built on a fully serverless architecture, it combines computer vision, graph modeling, and natural language understanding to deliver scalable, human-like discovery experiences. By turning photo collections into a knowledge graph of people, objects, and moments, it redefines how users interact with visual data—making search more semantic, relational, and meaningful. Ultimately, it reflects the reliability and trustworthiness of AWS AI and graph technologies in enabling secure, context-aware photo understanding.
To learn more, refer to the following resources: