In the first post of this three-part series, we presented a solution that demonstrates how you can automate detecting document tampering and fraud at scale using AWS AI and machine learning (ML) services for a mortgage underwriting use case.
In the second post, we discussed an approach to develop a deep learning-based computer vision model to detect and highlight forged images in mortgage underwriting.
In this post, we present a solution to automate mortgage document fraud detection using an ML model and business-defined rules with Amazon Fraud Detector.
We use Amazon Fraud Detector, a fully managed fraud detection service, to automate the detection of fraudulent activities. The objective is to improve fraud prediction accuracy, and with it underwriting accuracy, by proactively identifying document fraud. Amazon Fraud Detector helps you build customized fraud detection models using a historical dataset, configure customized decision logic using the built-in rules engine, and orchestrate risk decision workflows with the click of a button.
The following diagram represents each stage in a mortgage document fraud detection pipeline.
This post covers the third component of the mortgage document fraud detection pipeline. The steps to deploy this component are as follows:
The following are prerequisite steps for this solution:
The historical dataset must include the columns EVENT_TIMESTAMP and EVENT_LABEL.
After you have the custom historical data files to train a fraud detector model, create an S3 bucket and upload the data to the bucket.
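Before uploading, it can help to confirm that the training file carries both required columns. The following is a minimal sketch of such a check. The sample rows, the variable columns applicant_income and document_tampered, and the upload_dataset helper are assumptions made for illustration and are not part of the original solution.

```python
import csv
import io

# Columns the training data is expected to contain, as named in this post.
REQUIRED_HEADERS = {"EVENT_TIMESTAMP", "EVENT_LABEL"}


def missing_headers(csv_text: str) -> set:
    """Return the required column names that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {name.strip() for name in header}
    return REQUIRED_HEADERS - present


def upload_dataset(local_path: str, bucket: str, key: str) -> None:
    """Upload the dataset to Amazon S3 (hypothetical helper, needs AWS credentials)."""
    import boto3  # imported lazily so the header check runs without the AWS SDK

    boto3.client("s3").upload_file(local_path, bucket, key)


# Hypothetical sample rows; the variable columns are made up for illustration.
sample = (
    "EVENT_TIMESTAMP,EVENT_LABEL,applicant_income,document_tampered\n"
    "2023-01-15T10:30:00Z,legit,85000,0\n"
    "2023-01-16T14:05:00Z,fraud,42000,1\n"
)
print(missing_headers(sample))  # an empty set means both required columns exist
```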
The next step towards building and training a fraud detector model is to define the business activity (event) to evaluate for fraud. Defining an event involves setting the variables in your dataset, the entity initiating the event, and the labels that classify the event.
Complete the following steps to define a docfraud event to detect document fraud. The event is initiated by the entity applicant_mortgage, which refers to a new mortgage application:
Enter docfraud as the event type name and, optionally, enter a description of the event.
Enter applicant_mortgage as the entity type name and, optionally, enter a description of the entity type.
For the data location, enter the Amazon S3 path to your example dataset, such as S3://your-bucket-name/example dataset filename.csv.
Variables represent data elements that you want to use in a fraud prediction. These variables can be taken from the event dataset that you prepared for training your model, from your Amazon Fraud Detector model’s risk score outputs, or from Amazon SageMaker models. For more information about variables taken from the event dataset, see Get event dataset requirements using the Data models explorer.
Create a label and enter fraud as the name. This label corresponds to the value that represents the fraudulent mortgage application in the example dataset.
Create a second label named legit. This label corresponds to the value that represents the legitimate mortgage application in the example dataset.
The following screenshot shows our event type details.
The following screenshot shows our variables.
The following screenshot shows our labels.
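The same event definition can also be created programmatically. The following is a minimal sketch that assumes the boto3 frauddetector client; the two variables listed are hypothetical placeholders, because the post does not enumerate the columns of the example dataset.

```python
EVENT_TYPE = "docfraud"
ENTITY_TYPE = "applicant_mortgage"
LABELS = ["fraud", "legit"]

# Hypothetical variables; replace them with the columns of your own dataset.
VARIABLES = [
    {"name": "applicant_income", "dataType": "FLOAT", "defaultValue": "0.0"},
    {"name": "document_tampered", "dataType": "INTEGER", "defaultValue": "0"},
]


def build_event_type_request(variable_names):
    """Assemble the keyword arguments for put_event_type."""
    return {
        "name": EVENT_TYPE,
        "eventVariables": list(variable_names),
        "labels": LABELS,
        "entityTypes": [ENTITY_TYPE],
    }


def define_event(client=None):
    """Create the entity type, labels, variables, and event type in that order."""
    if client is None:
        import boto3  # imported lazily; requires AWS credentials and a Region

        client = boto3.client("frauddetector")
    client.put_entity_type(name=ENTITY_TYPE, description="New mortgage application")
    for label in LABELS:
        client.put_label(name=label)
    for variable in VARIABLES:
        # dataSource="EVENT" marks a variable that is supplied with each event.
        client.create_variable(dataSource="EVENT", **variable)
    names = [variable["name"] for variable in VARIABLES]
    client.put_event_type(**build_event_type_request(names))
```

The event type is created last because it refers to the entity type, labels, and variables by name.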
After you have loaded the historical data and selected the required options to train a model, complete the following steps to create a model:
Enter mortgage_fraud_detection_model as the model’s name and an optional description of the model.
For the event type, choose docfraud. This is the event type that you created earlier.
For the fraud label, choose fraud, which corresponds to the value that represents fraudulent events in the example dataset.
For the legitimate label, choose legit, which corresponds to the value that represents legitimate events in the example dataset.
Amazon Fraud Detector creates a model and begins to train a new version of the model.
On the Model versions page, the Status column indicates the status of model training. Model training that uses the example dataset takes approximately 45 minutes to complete. The status changes to Ready to deploy after model training is complete.
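Model creation and training can also be scripted. The following is a minimal sketch that assumes the boto3 frauddetector client is passed in as client. The ONLINE_FRAUD_INSIGHTS model type, the polling interval, and the function names are assumptions; the console status Ready to deploy is assumed to correspond to the API status TRAINING_COMPLETE.

```python
import time

MODEL_ID = "mortgage_fraud_detection_model"
MODEL_TYPE = "ONLINE_FRAUD_INSIGHTS"  # assumption: the post does not name the model type


def build_model_version_request(variable_names, data_location, role_arn):
    """Assemble the keyword arguments for create_model_version."""
    return {
        "modelId": MODEL_ID,
        "modelType": MODEL_TYPE,
        "trainingDataSource": "EXTERNAL_EVENTS",  # train from a file in Amazon S3
        "trainingDataSchema": {
            "modelVariables": list(variable_names),
            # Map the dataset's label values onto the two model classes.
            "labelSchema": {"labelMapper": {"FRAUD": ["fraud"], "LEGIT": ["legit"]}},
        },
        "externalEventsDetail": {
            "dataLocation": data_location,
            "dataAccessRoleArn": role_arn,
        },
    }


def train_model(client, variable_names, data_location, role_arn, poll_seconds=60):
    """Start training and poll until the version leaves TRAINING_IN_PROGRESS."""
    client.create_model(modelId=MODEL_ID, modelType=MODEL_TYPE, eventTypeName="docfraud")
    version = client.create_model_version(
        **build_model_version_request(variable_names, data_location, role_arn)
    )
    number = version["modelVersionNumber"]
    while True:
        status = client.get_model_version(
            modelId=MODEL_ID, modelType=MODEL_TYPE, modelVersionNumber=number
        )["status"]
        if status != "TRAINING_IN_PROGRESS":
            return number, status
        time.sleep(poll_seconds)
```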
After the model training is complete, Amazon Fraud Detector validates the model performance using 15% of your data that was not used to train the model and provides various tools, including a score distribution chart and confusion matrix, to assess model performance.
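To make the confusion matrix concrete, the following sketch counts true and false positives at a chosen score threshold and derives the true positive and false positive rates. The scored events are invented for illustration and assume the 0 to 1000 score scale that Amazon Fraud Detector uses. This is not the service's own evaluation code.

```python
def confusion_matrix(scored_events, threshold):
    """Count outcomes when scores at or above the threshold are treated as fraud."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for score, label in scored_events:
        predicted_fraud = score >= threshold
        if predicted_fraud and label == "fraud":
            counts["tp"] += 1
        elif predicted_fraud:
            counts["fp"] += 1
        elif label == "fraud":
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def rates(counts):
    """Return (true positive rate, false positive rate), guarding against empty classes."""
    fraud_total = counts["tp"] + counts["fn"]
    legit_total = counts["fp"] + counts["tn"]
    tpr = counts["tp"] / fraud_total if fraud_total else 0.0
    fpr = counts["fp"] / legit_total if legit_total else 0.0
    return tpr, fpr


# Hypothetical held-out events as (model score, true label) pairs.
held_out = [
    (950, "fraud"), (920, "legit"), (700, "fraud"),
    (300, "legit"), (120, "legit"), (40, "legit"),
]
matrix = confusion_matrix(held_out, threshold=900)
print(matrix)         # {'tp': 1, 'fp': 1, 'tn': 3, 'fn': 1}
print(rates(matrix))  # (0.5, 0.25)
```

Lowering the threshold catches more fraud at the cost of more false positives, which is the trade-off the score distribution chart helps you inspect.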
To view the model’s performance, complete the following steps:
Choose the model that you just trained (mortgage_fraud_detection_model), then choose 1.0. This is the version of your model that Amazon Fraud Detector created.
After you have reviewed the performance metrics of your trained model and are ready to use it to generate fraud predictions, you can deploy the model:
Choose your model (mortgage_fraud_detection_model), and then choose the specific model version that you want to deploy. For this post, choose 1.0.
On the Model versions page, the Status column shows the status of the deployment. The status changes to Active when the deployment is complete. This indicates that the model version is activated and available to generate fraud predictions.
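Deployment can likewise be requested through the API. The following is a minimal sketch that assumes the boto3 frauddetector client is passed in as client, that the model carries the name given when it was created earlier in this post, and that its type is ONLINE_FRAUD_INSIGHTS, which the post does not state.

```python
def deploy_model_version(client, version="1.0"):
    """Request activation of a trained model version and return its current status."""
    identifiers = {
        "modelId": "mortgage_fraud_detection_model",
        "modelType": "ONLINE_FRAUD_INSIGHTS",  # assumption, not stated in the post
        "modelVersionNumber": version,
    }
    client.update_model_version_status(status="ACTIVE", **identifiers)
    # Activation is asynchronous, so the status returned here may still be
    # an intermediate value rather than ACTIVE.
    return client.get_model_version(**identifiers)["status"]
```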
After you have deployed the model, you build a detector for the docfraud event type and add the deployed model. Complete the following steps:
Enter fraud_detector for the detector name and, optionally, enter a description for the detector, such as my sample fraud detector.
For the event type, choose docfraud. This is the event that you created earlier.
After you have created the Amazon Fraud Detector model, you can use the Amazon Fraud Detector console or application programming interface (API) to define business-driven rules (conditions that tell Amazon Fraud Detector how to interpret the model score when evaluating an event for fraud). To align with the mortgage underwriting process, you may create rules that flag mortgage applications by risk level: fraud, legitimate, or review needed.
For example, you may want to automatically decline mortgage applications with a high fraud risk, considering parameters like tampered images of the required documents, missing documents like paystubs or income requirements, and so on. On the other hand, certain applications may need a human in the loop for making effective decisions.
Amazon Fraud Detector uses the aggregated value (calculated by combining a set of raw variables) and raw value (the value provided for the variable) to generate the model scores. The model scores can be between 0–1000, where 0 indicates low fraud risk and 1000 indicates high fraud risk.
To add the respective business-driven rules, complete the following steps:
Create a high_risk rule with the following details, and add it so that the rule is available for use in your detector:
Description: fraud
Outcome: decline
Expression: $docfraud_insightscore >= 900
Create a low_risk rule with the following details:
Description: legit
Outcome: approve
Expression: $docfraud_insightscore <= 500
Create a medium_risk rule with the following details:
Description: review needed
Outcome: review
Expression: $docfraud_insightscore < 900 and $docfraud_insightscore > 500
These values are examples used for this post. When you create rules for your own detector, use values that are appropriate for your model and use case.
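To see how the three example rules partition the score range, the following sketch simulates them locally with the thresholds used in this post. It is an illustration only and not the Amazon Fraud Detector rules engine. It assumes rules are evaluated in the order listed and that the first matching rule determines the outcome.

```python
# (rule name, predicate on the model score, outcome), in evaluation order.
RULES = [
    ("high_risk", lambda score: score >= 900, "decline"),
    ("low_risk", lambda score: score <= 500, "approve"),
    ("medium_risk", lambda score: 500 < score < 900, "review"),
]


def first_matched(score):
    """Return (rule name, outcome) for the first rule that matches, else None."""
    for rule_name, predicate, outcome in RULES:
        if predicate(score):
            return rule_name, outcome
    return None


for example_score in (950, 700, 120):
    print(example_score, first_matched(example_score))
```

Every score from 0 to 1000 matches exactly one rule here, so no application is left without an outcome.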
After you have defined the rules and their outcomes, you can use the Amazon Fraud Detector API to evaluate lending applications and predict potential fraud. Predictions can be generated in batch or in real time.
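For real-time evaluation, a request could look like the following minimal sketch. It assumes the boto3 frauddetector client is passed in as client and that the detector, event type, and entity type carry the names used in this post. The function names and the variables dictionary are hypothetical.

```python
from datetime import datetime, timezone


def build_prediction_request(event_id, entity_id, variables, detector_id="fraud_detector"):
    """Assemble the keyword arguments for get_event_prediction."""
    return {
        "detectorId": detector_id,
        "eventId": event_id,
        "eventTypeName": "docfraud",
        # ISO 8601 timestamp in UTC for when the event occurred.
        "eventTimestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "entities": [{"entityType": "applicant_mortgage", "entityId": entity_id}],
        # Variable values are passed as strings.
        "eventVariables": {name: str(value) for name, value in variables.items()},
    }


def outcomes_from_response(response):
    """Flatten the outcomes of all matched rules into one list."""
    return [
        outcome
        for rule_result in response.get("ruleResults", [])
        for outcome in rule_result.get("outcomes", [])
    ]


def evaluate_application(client, event_id, entity_id, variables):
    """Score one mortgage application and return the outcomes of the matched rules."""
    request = build_prediction_request(event_id, entity_id, variables)
    return outcomes_from_response(client.get_event_prediction(**request))
```

Because no detector version is specified in the request, the sketch relies on the detector having an active version.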
If you already have a fraud detection model in SageMaker, you can integrate it with Amazon Fraud Detector and use its output in your detector.
This means that you can use both SageMaker and Amazon Fraud Detector models in your application to detect different types of fraud. For example, your application can use the Amazon Fraud Detector model to assess the fraud risk of customer accounts, and simultaneously use your SageMaker model to check for account compromise risk.
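Registering a SageMaker endpoint with Amazon Fraud Detector could look like the following sketch, which builds the arguments for the boto3 put_external_model call. The endpoint name, role ARN, input variables, and the sagemaker_fraud_score output variable are all hypothetical. The sketch assumes the endpoint accepts CSV input and returns the score as the first CSV field.

```python
def build_external_model_request(endpoint_name, role_arn, input_variables,
                                 score_variable="sagemaker_fraud_score"):
    """Assemble the keyword arguments for put_external_model."""
    # Each event variable is referenced in the template as ${variable_name}.
    csv_template = ",".join("${" + name + "}" for name in input_variables)
    return {
        "modelEndpoint": endpoint_name,
        "modelSource": "SAGEMAKER",
        "invokeModelEndpointRoleArn": role_arn,
        "inputConfiguration": {
            "eventTypeName": "docfraud",
            "format": "TEXT_CSV",
            "useEventVariables": True,
            "csvInputTemplate": csv_template,
        },
        "outputConfiguration": {
            "format": "TEXT_CSV",
            # Field 0 of the endpoint's CSV response is stored in score_variable,
            # which must already exist as an external model score variable.
            "csvIndexToVariableMap": {"0": score_variable},
        },
        "modelEndpointStatus": "ASSOCIATED",
    }


def register_endpoint(client, endpoint_name, role_arn, input_variables):
    """Associate the SageMaker endpoint with Amazon Fraud Detector."""
    client.put_external_model(
        **build_external_model_request(endpoint_name, role_arn, input_variables)
    )
```

After the endpoint is associated, its score variable can be referenced in rule expressions alongside the Amazon Fraud Detector model score.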
To avoid incurring any future charges, delete the resources created for the solution, including the following:
This post walked you through an automated and customized solution to detect fraud in the mortgage underwriting process. This solution allows you to detect fraudulent attempts closer to the time of fraud occurrence and helps underwriters with an effective decision-making process. Additionally, the flexibility of the implementation allows you to define business-driven rules to classify and capture the fraudulent attempts customized to specific business needs.
For more information about building an end-to-end mortgage document fraud detection solution, refer to Part 1 and Part 2 in this series.