The ability to quickly build and deploy machine learning (ML) models is becoming increasingly important in today’s data-driven world. However, building ML models requires significant time, effort, and specialized expertise. From data collection and cleaning to feature engineering, model building, tuning, and deployment, ML projects often take months for developers to complete. And experienced data scientists can be hard to come by.
This is where the AWS suite of low-code and no-code ML services becomes an essential tool. With just a few clicks using Amazon SageMaker Canvas, you can take advantage of the power of ML without needing to write any code.
As a strategic systems integrator with deep ML experience, Deloitte utilizes the no-code and low-code ML tools from AWS to efficiently build and deploy ML models for Deloitte’s clients and for internal assets. These tools allow Deloitte to develop ML solutions without needing to hand-code models and pipelines. This can help speed up project delivery timelines and enable Deloitte to take on more client work.
The following are some specific reasons why Deloitte uses these tools:
Additionally, these tools provide a comprehensive solution for faster workflows, enabling the following:
Vishveshwara Vasa, Cloud CTO for Deloitte, says:
“Through AWS’s no-code ML services such as SageMaker Canvas and SageMaker Data Wrangler, we at Deloitte Consulting have unlocked new efficiencies, enhancing the speed of development and deployment productivity by 30–40% across our client-facing and internal projects.”
In this post, we demonstrate the power of building an end-to-end ML model with no code using SageMaker Canvas by showing you how to build a classification model for predicting if a customer will default on a loan. By predicting loan defaults more accurately, the model can help a financial services company manage risk, price loans appropriately, improve operations, provide additional services, and gain a competitive advantage. We demonstrate how SageMaker Canvas can help you rapidly go from raw data to a deployed binary classification model for loan default prediction.
SageMaker Canvas offers comprehensive data preparation capabilities powered by Amazon SageMaker Data Wrangler in the SageMaker Canvas workspace. This enables you to go through all the phases of a standard ML workflow, from data preparation to model building and deployment, on a single platform.
Data preparation is typically the most time-intensive phase of the ML workflow. To reduce time spent on data preparation, SageMaker Canvas allows you to prepare your data using over 300 built-in transformations. Alternatively, you can write natural language prompts, such as “drop the rows for column c that are outliers,” and be presented with the code snippet necessary for this data preparation step. You can then add this to your data preparation workflow in a few clicks. We show you how to use that in this post as well.
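To make the natural language example concrete, the following is a minimal sketch of the kind of logic a prompt such as “drop the rows for column c that are outliers” might translate to. It is not the snippet SageMaker Canvas generates. The function name `drop_outlier_rows`, the 1.5 × IQR rule used to define an outlier, and the sample values are all assumptions made for illustration, and the code uses only the Python standard library.

```python
import statistics


def drop_outlier_rows(rows, column, k=1.5):
    """Keep rows whose `column` value lies within k * IQR of the quartiles.

    The IQR rule is an assumed definition of "outlier"; a generated
    snippet may use a different rule (for example, standard deviations).
    """
    values = [row[column] for row in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [row for row in rows if low <= row[column] <= high]


# Hypothetical sample data: one value in column "c" is far from the rest.
rows = [{"c": v} for v in (10, 12, 11, 13, 12, 11, 500)]
cleaned = drop_outlier_rows(rows, "c")
```

In this sample, the row with the value 500 falls outside the computed bounds and is removed, while the remaining six rows are kept.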
The following diagram describes the architecture for a loan default classification model using SageMaker low-code and no-code tools.
Starting with a loan default dataset stored in Amazon Simple Storage Service (Amazon S3), we use SageMaker Canvas to gain insights about the data. We then perform feature engineering, applying transformations such as encoding categorical features and dropping features that are not needed. Next, we store the cleaned data back in Amazon S3 and use it to create a classification model for predicting loan defaults. The result is a production-ready model for inference.
Make sure that the following prerequisites are complete and that you have enabled the Canvas Ready-to-use models option when setting up the SageMaker domain. If you have already set up your domain, edit your domain settings and go to Canvas settings to enable the Enable Canvas Ready-to-use models option. Additionally, set up and create the SageMaker Canvas application, then request and enable Anthropic Claude model access on Amazon Bedrock.
We use a public dataset from Kaggle that contains information about financial loans. Each row in the dataset represents a single loan, and the columns provide details about each transaction. Download this dataset and store it in an S3 bucket of your choice. The following table lists the fields in the dataset.
Column Name | Data Type | Description |
Person_age | Integer | Age of the person who took a loan |
Person_income | Integer | Income of the borrower |
Person_home_ownership | String | Home ownership status (own or rent) |
Person_emp_length | Decimal | Length of employment, in years |
Loan_intent | String | Reason for loan (personal, medical, educational, and so on) |
Loan_grade | String | Loan grade (A–E) |
Loan_int_rate | Decimal | Interest rate |
Loan_amnt | Integer | Total amount of the loan |
Loan_status | Integer | Target (whether they defaulted or not) |
Loan_percent_income | Decimal | Loan amount as a percentage of income |
Cb_person_default_on_file | Integer | Previous defaults (if any) |
Cb_person_credit_history_length | String | Length of their credit history |
Data preparation can take up to 80% of the effort in ML projects. Proper data preparation leads to better model performance and more accurate predictions. SageMaker Canvas allows interactive data exploration, transformation, and preparation without writing any SQL or Python code.
Complete the following steps to prepare your data:
This is a recommended step to analyze the quality of the input dataset. The output of this report produces instant ML-powered insights such as data skew, duplicates in the data, missing values, and much more. The following screenshot shows a sample of the generated report for the loan dataset.
By generating these insights on your behalf, SageMaker Canvas provides you with a set of issues in the data that need remediation in the data preparation phase. To address the top two issues identified by SageMaker Canvas, you need to encode the categorical features and remove the duplicate rows so that your model quality is high. You can do both of these and more in a visual workflow with SageMaker Canvas.
Encode the categorical features loan_intent, loan_grade, and person_home_ownership.
Drop the cb_person_cred_history_length column because that column has the least predictive power, as shown in the Data Quality and Insights Report.
You can also add another step to create an Amazon S3 destination for the dataset to scale the workflow for a large dataset. The following diagram shows the SageMaker Canvas data flow after adding visual transformations.
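For readers who want to see what the transformations in this section amount to, the following is a minimal standard-library sketch of removing duplicate rows, dropping a low-value column, and one-hot encoding categorical features. SageMaker Canvas performs these steps visually, so this is not its implementation. The helper names and the sample rows are hypothetical, and the column names follow the ones used in this section.

```python
CATEGORICAL = ("loan_intent", "loan_grade", "person_home_ownership")
DROP = ("cb_person_cred_history_length",)


def drop_duplicates(rows):
    """Return rows with exact duplicates removed, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def drop_columns(rows, columns):
    """Return copies of rows without the named columns."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]


def one_hot_encode(rows, columns):
    """Replace each categorical column with one 0/1 column per category."""
    categories = {col: sorted({row[col] for row in rows}) for col in columns}
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in columns}
        for col in columns:
            for category in categories[col]:
                new_row[f"{col}_{category}"] = 1 if row[col] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical sample rows; the first two are exact duplicates.
rows = [
    {"loan_intent": "MEDICAL", "loan_grade": "B", "person_home_ownership": "RENT",
     "cb_person_cred_history_length": 3, "loan_amnt": 5000},
    {"loan_intent": "MEDICAL", "loan_grade": "B", "person_home_ownership": "RENT",
     "cb_person_cred_history_length": 3, "loan_amnt": 5000},
    {"loan_intent": "PERSONAL", "loan_grade": "A", "person_home_ownership": "OWN",
     "cb_person_cred_history_length": 7, "loan_amnt": 12000},
]
prepared = one_hot_encode(drop_columns(drop_duplicates(rows), DROP), CATEGORICAL)
```

After these steps, `prepared` holds two rows, each with indicator columns such as `loan_intent_MEDICAL` and `loan_grade_A` in place of the original categorical columns.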
You have completed the entire data processing and feature engineering step using visual workflows in SageMaker Canvas. This helps reduce the time a data engineer spends cleaning and preparing data for model development from weeks to days. The next step is to build the ML model.
Amazon SageMaker Canvas provides a no-code end-to-end workflow for building, analyzing, testing, and deploying this binary classification model. Complete the following steps:
After the model is deployed, you can call it through the AWS SDK or AWS Command Line Interface (AWS CLI) or make API calls to any application of your choice to confidently predict the risk of a potential borrower. For more information about testing your model, refer to Invoke real-time endpoints.
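As a rough illustration of calling the deployed model through the AWS SDK for Python (Boto3), the following sketch builds a CSV payload and sends it to a real-time endpoint with the `invoke_endpoint` operation of the `sagemaker-runtime` client. The endpoint name, the feature order, and the sample values are placeholders, not values from this post. The payload must match the columns and order your model was trained on, so treat the helper as an assumption to adapt.

```python
import csv
import io


def build_csv_payload(record, feature_order):
    """Serialize one record as a single CSV line in the given column order."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([record[name] for name in feature_order])
    return buffer.getvalue().rstrip("\n")


def predict(endpoint_name, payload, region_name=None):
    """Send a CSV payload to a SageMaker real-time endpoint and return the raw response."""
    # Imported here so the payload helper can be used without the AWS SDK installed.
    import boto3

    runtime = boto3.client("sagemaker-runtime", region_name=region_name)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=payload,
    )
    return response["Body"].read().decode("utf-8")


# Hypothetical record and column order, for illustration only.
example_order = ["person_age", "person_income", "loan_amnt"]
example_payload = build_csv_payload(
    {"person_age": 35, "person_income": 60000, "loan_amnt": 8000},
    example_order,
)

# Example call (requires AWS credentials and a deployed endpoint):
# print(predict("my-canvas-endpoint", example_payload))
```

The response format depends on how the model was deployed, so inspect the returned string before parsing it.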
To avoid incurring additional charges, log out of SageMaker Canvas or delete the SageMaker domain that was created. Additionally, delete the SageMaker model endpoint and delete the dataset that was uploaded to Amazon S3.
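If you prefer to script part of the cleanup, the following sketch shows one way to delete the endpoint and the uploaded dataset with Boto3. The function name `clean_up` and all resource names are hypothetical. The clients are passed in as arguments, which is a design choice made here so that the function can be exercised without AWS access. It does not delete the SageMaker domain, the endpoint configuration, or the model resource, which you would need to remove separately.

```python
def clean_up(sagemaker_client, s3_client, endpoint_name, bucket, key):
    """Delete a SageMaker endpoint and one S3 object.

    `sagemaker_client` is expected to behave like boto3.client("sagemaker")
    and `s3_client` like boto3.client("s3").
    """
    sagemaker_client.delete_endpoint(EndpointName=endpoint_name)
    s3_client.delete_object(Bucket=bucket, Key=key)


# Example call (requires AWS credentials; all names are placeholders):
# import boto3
# clean_up(
#     boto3.client("sagemaker"),
#     boto3.client("s3"),
#     endpoint_name="my-canvas-endpoint",
#     bucket="my-bucket",
#     key="loan-data/dataset.csv",
# )
```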
No-code ML accelerates development, simplifies deployment, doesn’t require programming skills, increases standardization, and reduces cost. These benefits made no-code ML attractive to Deloitte as a way to improve its ML service offerings, and the company has shortened its ML model build timelines by 30–40%.
Deloitte is a strategic global systems integrator with over 17,000 certified AWS practitioners across the globe. It continues to raise the bar through participation in the AWS Competency Program with 25 competencies, including Machine Learning. Connect with Deloitte to start using AWS no-code and low-code solutions in your enterprise.