1st Time Payments Analysis

About 40% of our transactions are cashless, meaning we charge the client and pay the majority of the charge to the driver every week (withholding our 15-25% commission).

We always pay the driver and take the risk for collecting the funds from the client.
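To make the risk concrete, here is a minimal arithmetic sketch. It assumes the driver payout is simply the charge minus the commission and that a charge which never succeeds costs the company the full driver payout. The function name and the example figures are hypothetical and not taken from the data.

```python
def failed_payment_exposure(charge: float, commission_rate: float) -> dict:
    """Split a charge into commission and driver payout (assumed model)."""
    if not 0 <= commission_rate <= 1:
        raise ValueError("commission_rate must be between 0 and 1")
    commission = charge * commission_rate
    driver_payout = charge - commission
    return {
        "commission": commission,
        "driver_payout": driver_payout,
        # Assumption: the driver is paid regardless, so an uncollected
        # charge costs the company the whole driver payout.
        "loss_if_charge_fails": driver_payout,
    }


# Hypothetical example: a charge of 10.0 at a 20% commission.
example = failed_payment_exposure(charge=10.0, commission_rate=0.20)
print(example)
```

Under these assumptions, one failed charge wipes out the commission earned on several successful ones, which is why first-time fraud matters even at low failure rates.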

Luckily, we have basic filters that block users from making multiple fraudulent transactions and that find groups of fraudulent users sharing the same devices or cards. The next step is to stop people who are making a fraudulent transaction for the first time.

Outcome and task description

Based on the sample data, please come up with the top 2-5 developments that should be done to reduce the percentage of failed payments (“is_successful_payment” equal to 0).

Keeping in mind that our developer team is small (3-4 people), describe why you picked exactly those.

If some parameters are missing from the list below, you can presume we collect all the data we reasonably can while someone uses the platform as a rider or driver.
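The target metric in the task description, the share of orders with is_successful_payment equal to 0, can be sketched as below. This is one possible way to rank segments, not a prescribed method. The helper name failure_rate_by, the min_orders parameter, and the toy rows are assumptions made for illustration.

```python
import pandas as pd


def failure_rate_by(df: pd.DataFrame, column: str, min_orders: int = 1) -> pd.DataFrame:
    """Order count, failed count and failure rate per value of `column`."""
    failed = df["is_successful_payment"].eq(0)
    out = failed.groupby(df[column]).agg(
        orders="size", failed="sum", failure_rate="mean"
    )
    # Drop segments that are too small to say anything about.
    out = out[out["orders"] >= min_orders]
    return out.sort_values("failure_rate", ascending=False)


# Tiny synthetic frame for illustration only; not taken from the real dump.
toy = pd.DataFrame({
    "country": ["ee", "ee", "za", "za", "za", "mx"],
    "is_successful_payment": [1, 1, 0, 1, 0, 1],
})

overall_failure_rate = 1 - toy["is_successful_payment"].mean()
by_country = failure_rate_by(toy, "country")
print(overall_failure_rate)
print(by_country)
```

The same helper can be pointed at other legend fields such as city_id, card_bin or failed_attempts to see where failures concentrate.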

Sample data

Here is a dump of 1st time credit card orders (worldwide): ZIP, CSV.

The data includes some metadata on users who make their 1st finished order with a credit card as the payment method, as well as metadata on the transaction itself.

Field legend:

created – time when the 1st time order request was created.
device_name – name of the device used to make order
device_os_version – version of the device OS
country – 2 char country code
city_id – internal system city ID (not relevant which one is which)
lat – latitude of the pickup spot for the order
lng – longitude of the pickup spot for the order
real_destination_lat – latitude of the destination for the order
real_destination_lng – longitude of the destination for the order
user_id – internal user ID
order_id – internal order ID
order_try_id – internal order try ID (order tries happen before client and driver are matched to an order)
distance – driver distance to the client pickup location, in meters
ride_distance – trip distance in meters
price – price charged to client, can be lower than “ride_price” if client had a discount, currencies vary and are undefined
ride_price – calculated price of the final trip, currencies vary and are undefined
price_review_status – “Price review” is when we send “ride_price” to be audited by a human to check for system errors. 99% of orders are final and should have “ok” already set. Some may still be in pending states; most likely you can discard those.
price_review_reason – automatic or manual reason for the price review to be requested.
is_successful_payment – 1 means the order was charged successfully, 0 means it has failed (including after all attempts to re-charge)
name – card details, irrelevant.
card_bin – details on card BIN.
failed_attempts – number of failed order attempts before this 1st finished order.
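A few of the fields in the legend combine naturally into derived features. The sketch below keeps only orders whose price review is final, flags discounted orders (price lower than ride_price), and computes the straight-line pickup-to-destination distance with the haversine formula for comparison against ride_distance. The function names, the derived column names, the "pending" label and the toy rows are assumptions for illustration.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between points given in degrees."""
    lat1, lng1, lat2, lng2 = map(np.radians, (lat1, lng1, lat2, lng2))
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))


def add_derived_features(df: pd.DataFrame) -> pd.DataFrame:
    # Keep only orders whose price review is final ("ok" per the legend).
    out = df[df["price_review_status"] == "ok"].copy()
    # The legend says price can be lower than ride_price when discounted.
    out["has_discount"] = out["price"] < out["ride_price"]
    out["straight_line_m"] = haversine_m(
        out["lat"], out["lng"],
        out["real_destination_lat"], out["real_destination_lng"],
    )
    return out


# Synthetic rows for illustration only; "pending" is a placeholder label.
toy = pd.DataFrame({
    "price_review_status": ["ok", "pending"],
    "price": [8.0, 5.0],
    "ride_price": [10.0, 5.0],
    "lat": [0.0, 10.0],
    "lng": [0.0, 10.0],
    "real_destination_lat": [0.0, 10.5],
    "real_destination_lng": [1.0, 10.5],
})
features = add_derived_features(toy)
print(features[["has_discount", "straight_line_m"]])
```

Because currencies vary and are undefined, the discount is kept as a boolean flag here rather than an absolute amount.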

-- Preview the first 5 rows; the notebook stores the result as the DataFrame variable df
SELECT * FROM '1st_adyen_rides-success-and-fail.csv'
LIMIT 5;

Loading Libraries

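## REQUIRED LIBRARIES
# For data wrangling
import numpy as np
import pandas as pd

# For visualization
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Show every row and column when displaying DataFrames
pd.options.display.max_rows = None
pd.options.display.max_columns = None

# For modelling (wildcard import brings in setup, compare_models, etc.)
from pycaret.classification import *

# Uncomment to install dependencies:
# !pip install pandas scikit-learn xgboost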

Read Dataset

# Read the data frame
df = pd.read_csv('1st_adyen_rides-success-and-fail.csv')

df.shape

There are 22 columns and more than 304K rows in the data set.

df.head()

DataFrame Information

df.info()

There are 7 categorical variables and 15 numerical variables
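The split between categorical and numerical variables can be checked directly from the dtypes rather than read off the df.info() output. The sketch below uses a small synthetic frame with column names borrowed from the field legend; the values are made up.

```python
import pandas as pd

# Synthetic frame for illustration only; values are not from the real dump.
toy = pd.DataFrame({
    "country": ["ee", "za", "mx"],
    "price": [4.5, 12.0, 7.3],
    "failed_attempts": [0, 2, 1],
    "device_name": ["iPhone8,1", "SM-G930F", "HUAWEI P10"],
})

# Numeric dtypes versus everything else (strings, objects, and so on).
numeric_cols = toy.select_dtypes(include="number").columns.tolist()
categorical_cols = toy.select_dtypes(exclude="number").columns.tolist()

print(len(categorical_cols), "categorical:", categorical_cols)
print(len(numeric_cols), "numerical:", numeric_cols)
```

Note that identifier columns such as city_id or user_id are likely to be reported as numeric by dtype even though they behave like categories, so the dtype count is only a starting point.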

Summary Statistics

# Basic statistics
# Set display format to avoid scientific notation
pd.options.display.float_format = '{:.2f}'.format

# Display summary statistics, transposed so each variable is a row
summary_statistics = df.describe()
summary_statistics.T
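Since the goal is to separate failed from successful payments, it can also help to split the summary statistics by is_successful_payment. This is a hedged sketch on synthetic rows; the choice of columns and statistics is an assumption, not part of the original notebook.

```python
import pandas as pd

# Synthetic rows for illustration only; not taken from the real dump.
toy = pd.DataFrame({
    "is_successful_payment": [0, 0, 1, 1, 1],
    "price": [10.0, 30.0, 5.0, 15.0, 10.0],
    "failed_attempts": [2, 4, 0, 0, 1],
})

# One row per outcome, with mean and median for each selected variable.
by_outcome = (
    toy.groupby("is_successful_payment")[["price", "failed_attempts"]]
    .agg(["mean", "median"])
)
print(by_outcome)
```

Large gaps between the two rows point to variables worth a closer look. Because currencies vary in the real data, price comparisons are more meaningful within a single country.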

Missing Values Check
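A minimal sketch of what a missing-values check could look like, assuming the usual pandas approach of counting NaN entries per column. The helper name missing_report and the toy rows are hypothetical; no results from the real data set are implied.

```python
import numpy as np
import pandas as pd


def missing_report(df: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values for columns that have any."""
    counts = df.isna().sum()
    report = pd.DataFrame({
        "missing": counts,
        "missing_pct": 100 * counts / len(df),
    })
    report = report[report["missing"] > 0]
    return report.sort_values("missing", ascending=False)


# Synthetic rows for illustration only; not taken from the real dump.
toy = pd.DataFrame({
    "price": [4.5, 12.0, 7.3, 9.9],
    "real_destination_lat": [59.4, np.nan, np.nan, 59.5],
    "price_review_reason": ["auto", None, "manual", "auto"],
})
report = missing_report(toy)
print(report)
```

On the real frame the call would be missing_report(df).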

