Food Claims Process Case Study
Report created by Tyler Chang; Dataset provided by DataCamp
Overview of the Report
- Company Background and Primary Questions
- Overview of the Dataset
- Data Validation
- Data Exploration and Visualization
- Recommendations
Company Background and Primary Questions
Vivendo is a fast food chain in Brazil with over 200 outlets. As with many fast food establishments, customers make claims against the company. For example, they blame Vivendo for suspected food poisoning. The legal team, which processes these claims, is currently split across four locations. The new head of the legal department wants to know whether the time it takes to close claims differs across the locations.
This report provides answers to the following questions:
- How does the number of claims differ across locations?
- What is the distribution of time to close claims?
- How does the average time to close claims differ by location?
Overview of the Dataset
The dataset originally contained 97 rows and 8 columns. The eight columns are as follows:
- Claim ID : Unique identifier for each claim
- Time to Close : Indicates the number of days it took for a claim to be closed
- Claim Amount : Initial value of the claim in the currency of Brazil
- Amount Paid : Total amount paid after the claim was closed in the currency of Brazil
- Location : Location of the claim
- Individuals on Claim : Number of individuals on a claim
- Linked Cases : Indicates whether a claim is believed to be linked to other claims
- Cause : Cause of the food poisoning injuries
Data Validation
After validation, 96 rows and 8 columns remain. The following adjustments were made.
The Cause column originally contained mostly empty values, all of which have been replaced with the word 'Unknown'.
The Claim Amount column's entries were reduced to their numeric values and rounded to the nearest whole number.
In the Time to Close column, one negative value was found. Since it is unknown whether the error is the negative sign alone or also a mistyped digit, the row containing the negative value has been removed.
Finally, it was confirmed that there were no missing values in the table.
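The validation steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the code from the hidden cells: the sample rows, the lowercase field names, and the "R$50,000.00" text format for claim amounts are all assumptions made for the example.

```python
import re

# Hypothetical raw rows: the values and the "R$50,000.00" text format are
# illustrative assumptions, not taken from the actual dataset.
raw_rows = [
    {"claim_id": "A-1", "time_to_close": 2082, "claim_amount": "R$50,000.00", "cause": None},
    {"claim_id": "A-2", "time_to_close": -57, "claim_amount": "R$180,000.00", "cause": "meat"},
    {"claim_id": "A-3", "time_to_close": 3591, "claim_amount": "R$70,000.49", "cause": ""},
]


def clean_amount(text):
    """Strip everything except digits and the decimal point, then round."""
    return round(float(re.sub(r"[^0-9.]", "", text)))


def validate(rows):
    """Apply the adjustments described above and return the cleaned rows."""
    cleaned = []
    for row in rows:
        # Drop rows with a negative time to close instead of guessing a fix.
        if row["time_to_close"] < 0:
            continue
        cleaned.append({
            "claim_id": row["claim_id"],
            "time_to_close": row["time_to_close"],
            "claim_amount": clean_amount(row["claim_amount"]),
            # Empty or missing causes become the placeholder 'Unknown'.
            "cause": row["cause"] or "Unknown",
        })
    return cleaned


cleaned_rows = validate(raw_rows)

# Confirm that no missing values remain after cleaning.
has_missing = any(
    value is None or value == ""
    for row in cleaned_rows
    for value in row.values()
)
print(len(cleaned_rows), has_missing)
```

Dropping the row with the negative value, rather than flipping its sign, mirrors the reasoning above: the sketch does not assume which part of the entry was mistyped.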
The first six rows of the data table used for all subsequent data exploration and visualization are shown below.
Data Exploration and Visualization
How does the number of claims differ across locations?
There are four locations in the dataset: Fortaleza, Natal, Recife, and Sao Luis. The number of claims per location ranges from 21 in Natal to 29 in Sao Luis. Although Sao Luis has the most claims, the range of only 8 claims is relatively small, which suggests that location alone is unlikely to be a strong indicator of the number of claims.
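As a rough sketch of how such counts and their range could be computed, the snippet below tallies a short, made-up list of locations with the standard library. The list contents and the helper name `claims_by_location` are hypothetical. The real table has 96 rows, and its per-location counts are not reproduced here.

```python
from collections import Counter

# Hypothetical location column; the real dataset has 96 rows.
locations = ["Recife", "Natal", "Sao Luis", "Sao Luis", "Fortaleza", "Recife", "Sao Luis"]


def claims_by_location(values):
    """Return the number of claims per location and the range of those counts."""
    counts = Counter(values)
    spread = max(counts.values()) - min(counts.values())
    return counts, spread


counts, spread = claims_by_location(locations)

# List locations from most to fewest claims, then the range between them.
for location, n in counts.most_common():
    print(f"{location}: {n}")
print("range:", spread)
```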
Number of Claims by Location
What is the distribution of time to close claims?
Most claims are closed within 1500 days, and 50% close within approximately 700 days (the median, marked by the yellow line). The mean time to close, marked by the red line, is also under 1000 days but notably higher than the median. Together with the 3562-day range of closing times, this indicates a right-skewed distribution in which several claims took far longer to close than is typical. To understand where these unusually long closing times originate, the average time to close by location must be examined.
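The comparison of mean and median used above can be illustrated with a small sketch. The closing times below are invented for the example and are not values from the dataset. They only show how a single long-running claim pulls the mean above the median and widens the range.

```python
from statistics import mean, median

# Hypothetical times to close, in days; one claim takes far longer than the rest.
times_to_close = [200, 500, 700, 900, 2700]


def summarize(days):
    """Return the mean, median, and range of a list of closing times."""
    return {
        "mean": mean(days),
        "median": median(days),
        "range": max(days) - min(days),
    }


summary = summarize(times_to_close)

# A mean well above the median is a simple sign of right skew.
right_skewed = summary["mean"] > summary["median"]
print(summary, right_skewed)
```

Reporting the median alongside the mean is a deliberate choice in this sketch: the median is far less sensitive to a handful of extreme closing times.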