Course

Fraud Detection in Python

Intermediate skill level

Updated 2024/08

Learn how to detect fraud using Python.

Python, Machine Learning

4 hours

16 videos

57 exercises

4,800 XP

22,042

Certificate of completion

Course Description

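A typical organization is estimated to lose about 5% of its annual revenue to fraud. In this course, you will learn how to fight fraud using data. For example, you will learn how to apply supervised learning algorithms to detect fraudulent behavior similar to past cases, as well as unsupervised learning methods to discover new types of fraud. Fraud analytics also often involves highly imbalanced datasets when classifying fraud versus non-fraud, and this course teaches techniques for handling them. Combining technical content with theoretical insight, the course shows hands-on how to implement fraud detection models. It also offers tips and advice drawn from real-world experience to help you avoid common mistakes in fraud analytics.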

Prerequisites

Unsupervised Learning in Python

Supervised Learning with scikit-learn

1

Introduction and preparing your data

In this chapter, you'll learn about the typical challenges of fraud detection and how to resample your data to tackle the problems caused by imbalanced classes.

Introduction to fraud detection

Checking the fraud to non-fraud ratio

Plotting your data

Increasing successful detections using data resampling

Resampling methods for imbalanced data

Applying SMOTE

Compare SMOTE to original data

Fraud detection algorithms in action

Exploring the traditional way to catch fraud

Using ML classification to catch fraud

Logistic regression combined with SMOTE

Using a pipeline
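
The resampling idea in this chapter can be illustrated with a minimal sketch. It checks the fraud to non-fraud ratio and then adds synthetic fraud rows by interpolating between neighboring fraud cases, which is the core idea behind SMOTE. This is a simplified, standard-library-only illustration and not the course's own code: the function names, the toy data, and the `k` and `seed` parameters are all assumptions. In practice a library implementation such as `SMOTE` from imbalanced-learn would normally be used instead.

```python
import random


def fraud_ratio(labels):
    """Return the share of observations labeled as fraud (1)."""
    return sum(labels) / len(labels)


def smote_like_oversample(X, y, k=2, seed=0):
    """Add synthetic fraud rows until both classes have the same count.

    Each synthetic row lies on the line segment between a fraud row and
    one of its k nearest fraud neighbors (a simplified take on SMOTE).
    """
    rng = random.Random(seed)
    minority = [row for row, label in zip(X, y) if label == 1]
    n_needed = (len(y) - len(minority)) - len(minority)

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_needed):
        base = rng.choice(minority)
        # Nearest fraud neighbors of the chosen row, excluding the row itself.
        neighbors = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sq_dist(base, row),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])

    return X + synthetic, y + [1] * len(synthetic)


# Hypothetical toy data: two features per transaction, 1 = fraud.
X = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [1.1, 2.2], [1.0, 1.8],
     [0.8, 2.0], [1.3, 2.1], [5.0, 8.0], [5.5, 8.2], [5.2, 7.9]]
y = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

X_resampled, y_resampled = smote_like_oversample(X, y)
print(f"fraud ratio before: {fraud_ratio(y):.2f}")            # 0.30
print(f"fraud ratio after:  {fraud_ratio(y_resampled):.2f}")  # 0.50
```

Resampling of this kind should be applied to the training split only, so that synthetic rows never leak into the data used for evaluation.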

2

Fraud detection using labeled data

Now that you're familiar with the main challenges of fraud detection, you're about to learn how to flag fraudulent transactions with supervised learning. You will use classifiers, adjust them, and compare them to find the most efficient fraud detection model.

Review of classification methods

Natural hit rate

Random Forest Classifier - part 1

Random Forest Classifier - part 2

Performance evaluation

Performance metrics for the RF model

Plotting the Precision Recall Curve

Adjusting your algorithm weights

Model adjustments

Adjusting your Random Forest to fraud detection

GridSearchCV to find optimal parameters

Model results using GridSearchCV

Ensemble methods

Logistic Regression

Voting Classifier

Adjust weights within the Voting Classifier
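
The evaluation topics in this chapter (natural hit rate, precision and recall, and the effect of adjusting a model) can be sketched without any particular classifier. The example below assumes a hypothetical model that outputs a fraud score per transaction; the labels, the scores, and the function names are made up for illustration and are not taken from the course.

```python
def natural_hit_rate(y_true):
    """Accuracy obtained by predicting 'non-fraud' for every observation."""
    return y_true.count(0) / len(y_true)


def precision_recall(y_true, scores, threshold):
    """Precision and recall for the fraud class at a given score threshold."""
    y_pred = [1 if score >= threshold else 0 for score in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels (1 = fraud) and model scores for ten transactions.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.15, 0.30, 0.45, 0.60, 0.10, 0.40, 0.90]

print(natural_hit_rate(y_true))                # 0.8
print(precision_recall(y_true, scores, 0.50))  # (0.5, 0.5)
print(precision_recall(y_true, scores, 0.35))  # (0.5, 1.0)
```

In this toy data, lowering the threshold catches the second fraud case at the cost of one more false alarm. Sweeping the threshold over all score values is what a precision-recall curve plots, and a model is only useful if it beats the natural hit rate on the metrics that matter for fraud.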

3

Fraud detection using unlabeled data

This chapter focuses on using unsupervised learning techniques to detect fraud. You will segment customers and use K-means and other clustering algorithms to find suspicious occurrences in your data.

Normal versus abnormal behavior

Exploring your data

Customer segmentation

Using statistics to define normal behavior

Clustering methods to detect fraud

Scaling the data

K-means clustering

Elbow method

Assigning fraud versus non-fraud

Detecting outliers

Checking model results

Other clustering fraud detection methods

Assessing smallest clusters

Checking results
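
One common way to turn clustering into a fraud flag, sketched below, is to fit K-means and then treat the points farthest from their nearest centroid as suspicious. This is a small standard-library illustration under stated assumptions: the deterministic centroid initialization, the `fraction` cutoff, and the toy points are hypothetical choices, the features are assumed to be on comparable scales already, and a library implementation such as scikit-learn's `KMeans` would normally replace the hand-written loop.

```python
import math


def kmeans(points, k, n_iter=20):
    """Plain K-means with a deterministic start (evenly spaced input rows)."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(n_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[c]
            for c, cluster in enumerate(clusters)
        ]
    return centroids


def flag_outliers(points, centroids, fraction=0.1):
    """Flag the given fraction of points lying farthest from any centroid."""
    distances = [min(math.dist(p, c) for c in centroids) for p in points]
    n_flag = max(1, math.ceil(fraction * len(points)))
    cutoff = sorted(distances, reverse=True)[n_flag - 1]
    return [d >= cutoff for d in distances]


# Hypothetical toy data: two tight groups plus one unusual observation.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.0),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.1), (8.1, 8.0),
          (4.5, 12.0)]

centroids = kmeans(points, k=2)
flags = flag_outliers(points, centroids)
print([p for p, flagged in zip(points, flags) if flagged])  # [(4.5, 12.0)]
```

A flag produced this way marks unusual behavior, not confirmed fraud, so the flagged cases still need to be checked against known labels or reviewed by hand.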

4

Fraud detection using text

In this final chapter, you will use text data, text mining, and topic modeling to detect fraudulent behavior.

Using text data

Word search with dataframes

Using list of terms

Creating a flag

Text mining to detect fraud

Removing stopwords

Cleaning text data

Topic modeling on fraud

Create dictionary and corpus

Flagging fraud based on topics

Interpreting the topic model

Finding fraudsters based on topic
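
The first steps of this chapter (cleaning text, removing stopwords, and creating a flag from a list of terms) can be sketched as follows. The stopword list, the search terms, and the messages are hypothetical and deliberately tiny; a real project would typically use a fuller stopword list and stemming or lemmatization. Topic modeling is not covered by this sketch.

```python
import re

# Hypothetical, deliberately small lists for illustration only.
STOPWORDS = {"the", "a", "to", "and", "of", "is", "in", "for",
             "we", "please", "this", "on", "at"}
SEARCH_TERMS = {"offshore", "shred", "backdate"}


def clean_text(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]


def flag_message(text, terms=SEARCH_TERMS):
    """Return 1 if any cleaned token matches a search term, else 0."""
    return int(any(token in terms for token in clean_text(text)))


messages = [
    "Please move the funds to the offshore account.",
    "Lunch at noon on Friday?",
    "We need to shred the documents before the audit.",
]

flags = [flag_message(message) for message in messages]
print(flags)  # [1, 0, 1]
```

The match here is on exact tokens, so a variant such as "shredding" would not be flagged; normalizing word forms before matching is one way to widen the search.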
