KNIME for Finance: Fraud detection using a supervised ML model | by Thor L | Low Code for Data Science

DATA STORIES | FINANCE | KNIME ANALYTICS PLATFORM

Learn to detect fraud using a Random Forest model

Photo by Ibrahim Ahmed on Unsplash.

This is part of a series of articles that show you solutions to common finance tasks related to financial planning, accounting, tax calculations, and auditing problems, all implemented with the low-code KNIME Analytics Platform.

Credit card fraud detection is an ongoing challenge because new fraud patterns must be identified accurately as they appear. Datasets containing fraud examples are rare, and when they do exist, they often include only a limited number of outdated cases. This scarcity makes fraud detection particularly difficult, as it must continuously adapt to the evolving tactics of fraudsters.

There are two approaches to fraud detection:

Classic machine learning based predictions, when your dataset contains enough fraud examples
Outlier detection based methods, when your dataset doesn’t contain a sufficient number of fraud examples

The dataset that we will use contains a small percentage of fraudulent transactions. Based on these examples, we’ll implement the classic machine learning based approach to fraud detection in this article.

In the next couple of articles, we’ll show how to implement fraud detection algorithms using outlier detection based methods.

Whatever your data situation is, this series will show you how KNIME Analytics Platform offers a low-code solution for this problem. It can enable financial teams to automate data intake from various sources and leverage advanced analytics to detect fraudulent transactions, without the need for a coding background.

In this article on fraud detection, you’ll learn how to use the Random Forest supervised learning algorithm to help identify fraudulent transactions. Watch the video for an overview.

Credit card transactions can essentially be divided into two categories: legitimate and fraudulent. The task at hand is to accurately identify and flag fraudulent transactions, so that only a small minority of the flagged transactions turn out to be legitimate.
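The goal stated above can be expressed with two standard metrics: precision (the share of flagged transactions that really are fraudulent) and recall (the share of fraudulent transactions that get flagged). The following is a minimal sketch; the function name and the confusion counts are made up for illustration and do not come from the article's workflow.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion counts.

    tp: fraudulent transactions that were flagged
    fp: legitimate transactions that were flagged by mistake
    fn: fraudulent transactions that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts, chosen only to illustrate the calculation.
precision, recall = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

High precision means few legitimate transactions are flagged, which is the "small minority" condition in the task description.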

The process of fraud detection often involves several manual and automated steps to analyze transaction patterns, customer behavior, and other relevant factors. For our purposes, we’ll focus only on the automated part of detection: training a model on a labeled dataset and applying it to a new transaction to simulate incoming data from an outside data source.

We use a popular dataset available from Kaggle called Credit Card Fraud Detection. This dataset contains real, anonymized transactions made with credit cards in September 2013 by European cardholders. It consists of 284,807 transactions over two days, of which 492 are fraudulent. The dataset shows a severe class imbalance between the ‘good’ (0) and ‘frauds’ (1) classes, where ‘frauds’ account for only 0.172% of the data.
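The imbalance can be checked with a few lines of arithmetic. The sketch below uses only the two counts quoted above; the variable names are illustrative. It also shows why plain accuracy is a weak yardstick here: a model that never flags anything would already be right for almost every row.

```python
total_transactions = 284_807
fraud_transactions = 492

# Share of fraudulent rows (about 0.1727%, quoted as 0.172% in the text).
fraud_rate = fraud_transactions / total_transactions

# Number of legitimate transactions for every fraudulent one.
legit_per_fraud = (total_transactions - fraud_transactions) / fraud_transactions

# Accuracy of a trivial model that labels every transaction as legitimate.
always_legit_accuracy = 1 - fraud_rate

print(f"fraud rate: {fraud_rate:.4%}")
print(f"legitimate per fraud: {legit_per_fraud:.1f}")
print(f"accuracy of 'always legitimate': {always_legit_accuracy:.4%}")
```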

The dataset contains 31 columns: Time, the anonymized features V1 through V28, Amount, and Class.

A key column for our training is ‘Class’, as we need labeled data for a supervised training algorithm.
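Outside KNIME, separating the label from the input features could look roughly like the sketch below. It assumes a CSV whose label column is named Class, as in the Kaggle file; the sample rows are invented and abbreviated to a few columns, and the helper name is hypothetical.

```python
import csv
import io

# Invented sample rows; the real file has many more feature columns.
SAMPLE_CSV = """Time,V1,V2,Amount,Class
0,-1.2,0.3,120.5,0
1,0.9,-0.4,15.0,0
7,-2.5,1.8,310.0,1
"""


def split_features_and_label(csv_text, label_column="Class"):
    """Return (features, labels): one feature dict and one int label per row."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Remove the label from the row so it cannot leak into the features.
        labels.append(int(row.pop(label_column)))
        features.append({name: float(value) for name, value in row.items()})
    return features, labels


features, labels = split_features_and_label(SAMPLE_CSV)
print(labels)
print(features[0])
```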

The process for creating our classification model follows the steps below. Even when data comes from multiple sources, the overall process doesn’t change:

Create/import a labeled training dataset
Partition the data
Train the model
Evaluate model performance
Import the new, unseen transactions
Deploy the model and feed the new transactions in
Send a notification if any transactions are classified as fraudulent.
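For readers who want to see the steps above in code form, here is a rough, standard-library-only sketch. It is not the KNIME workflow: the data is synthetic, the feature names (amount, hour, prior_tx) are hypothetical, and the "forest" is a deliberately tiny ensemble of one-split trees (decision stumps), each trained on a bootstrap sample with a random subset of features and combined by majority vote. A real Random Forest learner grows much deeper trees.

```python
import random
from collections import Counter


def make_toy_data(n, n_fraud, rng):
    """Step 1: synthetic rows ([amount, hour, prior_tx], label); 1 = fraud."""
    rows = []
    for i in range(n):
        label = 1 if i < n_fraud else 0
        # Assumption for the toy data: frauds have clearly larger amounts.
        amount = rng.uniform(800, 1000) if label else rng.uniform(1, 200)
        rows.append(([amount, rng.uniform(0, 24), rng.randint(0, 50)], label))
    return rows


def stratified_split(rows, train_frac, rng):
    """Step 2: partition so both parts keep a similar share of fraud rows."""
    train, test = [], []
    for cls in (0, 1):
        group = [row for row in rows if row[1] == cls]
        rng.shuffle(group)
        cut = round(len(group) * train_frac)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(sample, features):
    """Find the single split (feature, threshold) with the fewest errors."""
    labels = [y for _, y in sample]
    fallback = majority(labels)
    # Baseline: no useful split, always predict the majority class.
    best = (sum(y != fallback for y in labels), features[0], float("inf"),
            fallback, fallback)
    for f in features:
        values = sorted({x[f] for x, _ in sample})
        for t in values[:-1]:  # both sides of the split stay non-empty
            left = [y for x, y in sample if x[f] <= t]
            right = [y for x, y in sample if x[f] > t]
            l_pred, r_pred = majority(left), majority(right)
            errors = (sum(y != l_pred for y in left)
                      + sum(y != r_pred for y in right))
            if errors < best[0]:
                best = (errors, f, t, l_pred, r_pred)
    return best[1:]


def train_forest(train, n_trees, n_features, rng):
    """Step 3: one stump per bootstrap sample and random feature subset."""
    all_features = list(range(len(train[0][0])))
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(train) for _ in train]
        features = rng.sample(all_features, n_features)
        forest.append(fit_stump(sample, features))
    return forest


def predict(forest, x):
    """Majority vote over all stumps; returns 0 (legitimate) or 1 (fraud)."""
    votes = [l_pred if x[f] <= t else r_pred for f, t, l_pred, r_pred in forest]
    return majority(votes)


rng = random.Random(42)
rows = make_toy_data(200, 20, rng)
train, test = stratified_split(rows, 0.7, rng)
forest = train_forest(train, n_trees=15, n_features=2, rng=rng)

# Step 4: confusion counts keyed by (actual, predicted).
confusion = Counter((y, predict(forest, x)) for x, y in test)
print("Confusion counts (actual, predicted):", dict(confusion))

# Steps 5 to 7: score new, unseen transactions and report any flagged ones.
new_transactions = [[950.0, 3.0, 2], [42.5, 14.0, 17]]
alerts = [x for x in new_transactions if predict(forest, x) == 1]
for x in alerts:
    print("Possible fraud, please review:", x)
```

The split is stratified because, with so few fraud rows, a purely random partition could leave one part with almost no fraud examples to train or evaluate on.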

All workflows used in this article are publicly available and free to download on the KNIME Community Hub. You can find the workflows in the KNIME for Finance space under Fraud Detection in the Random Forest section.

The first workflow covers training our model. You can view and download the training workflow Random Forest Model Training from the KNIME Community Hub.
