Risk mitigation is a key concern in the financial domain, and one major risk is fraudulent transactions, which can lead to breaches of data and systems. Fraudulent transactions are far fewer than legitimate ones, which makes the data highly skewed. Can you predict fraud accurately from such imbalanced data? Do you need different sampling techniques, and can they improve accuracy? The answer to all of these questions is yes. We demonstrate ways to handle skewed data with different sampling techniques and to generate accurate predictions with different statistical algorithms.
Our goal is to enable developers to use ensemble techniques such as bagging and boosting to strike a balance between accuracy and computational cost, and to apply undersampling and oversampling (SMOTE, the Synthetic Minority Over-sampling Technique) to imbalanced data.
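To make the two resampling ideas concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the implementation used in this work: the function names `random_undersample` and `smote_like_oversample`, the two-feature toy data, and the choice of `k=3` are all hypothetical. The oversampler follows only the core SMOTE idea of interpolating between a minority point and one of its nearest minority neighbours. Bagging and boosting are not covered here.

```python
import math
import random


def random_undersample(majority, minority, rng):
    """Keep a random subset of the majority class, the same size as the minority."""
    return rng.sample(majority, len(minority)), list(minority)


def smote_like_oversample(minority, n_new, k, rng):
    """Create n_new synthetic minority rows by interpolating between a random
    minority row and one of its k nearest minority neighbours (SMOTE-style)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the other minority rows.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy, randomly generated data (an assumption for illustration only):
# 200 "legitimate" points and 10 "fraud" points with two features each.
rng = random.Random(0)
legit = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
fraud = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(10)]

# Undersampling: shrink the majority class down to the minority size.
under_legit, under_fraud = random_undersample(legit, fraud, rng)

# Oversampling: grow the minority class up to the majority size.
synthetic_fraud = smote_like_oversample(fraud, len(legit) - len(fraud), k=3, rng=rng)
over_fraud = fraud + synthetic_fraud

print(len(under_legit), len(under_fraud))  # class sizes after undersampling
print(len(legit), len(over_fraud))         # class sizes after oversampling
```

Undersampling discards most of the legitimate transactions, which is cheap but throws information away. Oversampling keeps every row and adds synthetic fraud points, at the cost of a larger training set. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, is preferable to a hand-rolled version like this one.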