Predict Hotel Cancellation
Conclusion:
The Data
They have provided you with their bookings data in a file called hotel_bookings.csv, which contains the following:
Column | Description |
---|---|
Booking_ID | Unique identifier of the booking. |
no_of_adults | The number of adults. |
no_of_children | The number of children. |
no_of_weekend_nights | Number of weekend nights (Saturday or Sunday). |
no_of_week_nights | Number of week nights (Monday to Friday). |
type_of_meal_plan | Type of meal plan included in the booking. |
required_car_parking_space | Whether a car parking space is required. |
room_type_reserved | The type of room reserved. |
lead_time | Number of days before the arrival date the booking was made. |
arrival_year | Year of arrival. |
arrival_month | Month of arrival. |
arrival_date | Date of the month for arrival. |
market_segment_type | How the booking was made. |
repeated_guest | Whether the guest has previously stayed at the hotel. |
no_of_previous_cancellations | Number of previous cancellations. |
no_of_previous_bookings_not_canceled | Number of previous bookings that were not canceled. |
avg_price_per_room | Average price per day of the booking. |
no_of_special_requests | Count of special requests made as part of the booking. |
booking_status | Whether the booking was canceled or not. |
Source (data has been modified): https://www.kaggle.com/datasets/ahsan81/hotel-reservations-classification-dataset
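Since the source notes that the data has been modified, it may be worth confirming that the file still matches the column list above before exploring it. The following is a minimal sketch, not part of the original notebook: `EXPECTED_COLUMNS` is copied from the table, while `check_columns` and the empty `sample` frame are hypothetical names used only for illustration.

```python
import pandas as pd

# Column names copied from the data dictionary above.
EXPECTED_COLUMNS = [
    "Booking_ID", "no_of_adults", "no_of_children", "no_of_weekend_nights",
    "no_of_week_nights", "type_of_meal_plan", "required_car_parking_space",
    "room_type_reserved", "lead_time", "arrival_year", "arrival_month",
    "arrival_date", "market_segment_type", "repeated_guest",
    "no_of_previous_cancellations", "no_of_previous_bookings_not_canceled",
    "avg_price_per_room", "no_of_special_requests", "booking_status",
]


def check_columns(df):
    """Return (missing, unexpected) column names relative to the data dictionary."""
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    unexpected = [c for c in df.columns if c not in EXPECTED_COLUMNS]
    return missing, unexpected


# Hypothetical stand-in for the real file: same header, but the last column is renamed.
sample = pd.DataFrame(columns=EXPECTED_COLUMNS[:-1] + ["status"])
missing, unexpected = check_columns(sample)
print("missing:", missing)
print("unexpected:", unexpected)
```

On the real data, the same check could be run as `check_columns(hotels)` once the CSV has been loaded.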
import plotly.graph_objects as go
import plotly.express as px
import matplotlib.pyplot as plt
import pandas as pd
hotels = pd.read_csv("data/hotel_bookings.csv")
hotels
hotels.isna().sum()
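`isna().sum()` gives a raw count of missing values per column. A possible extension, sketched here under the assumption that a percentage is also useful when deciding how to handle the gaps, puts the count and the share side by side. The `sample` frame and its values are made up for illustration and do not come from hotel_bookings.csv.

```python
import pandas as pd

# Hypothetical miniature of the bookings data; the values are invented.
sample = pd.DataFrame({
    "lead_time": [10, None, 45, 3],
    "avg_price_per_room": [80.0, 95.5, None, None],
    "booking_status": ["Canceled", "Not_Canceled", "Not_Canceled", "Canceled"],
})

# Count of missing values and share of rows affected, per column.
report = pd.DataFrame({
    "missing": sample.isna().sum(),
    "percent": sample.isna().mean().mul(100),
})

# Keep only columns that actually have gaps, worst first.
report = report[report["missing"] > 0].sort_values("missing", ascending=False)
print(report)
```

Replacing `sample` with `hotels` would produce the same report for the full dataset.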
#hotels['arrival_year'] = pd.to_datetime(hotels['arrival_year'])
hotels.value_counts('no_of_adults')
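`value_counts` returns raw counts. If the share of each group is what matters, `normalize=True` returns proportions instead. The sketch below shows both side by side on a small made-up frame (`sample` is a hypothetical stand-in for `hotels`).

```python
import pandas as pd

# Hypothetical stand-in for the `hotels` DataFrame; the values are invented.
sample = pd.DataFrame({"no_of_adults": [2, 2, 1, 2, 3, 1, 2, 2]})

# Raw counts per group, and the same groups as a percentage of all rows.
counts = sample["no_of_adults"].value_counts()
shares = sample["no_of_adults"].value_counts(normalize=True).mul(100).round(1)

# Both Series are indexed by the number of adults, so they align by label.
summary = pd.DataFrame({"bookings": counts, "percent": shares})
print(summary)
```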
# Distribution of bookings by number of adults (bar height is the booking count)
fig = px.histogram(data_frame=hotels,
                   x='no_of_adults', title='Bookings by number of adults', color='no_of_adults')
fig.show()
# Distribution of bookings by number of children
fig = px.histogram(data_frame=hotels,
                   x='no_of_children', title='Bookings by number of children', color='no_of_children')
fig.show()
# Distribution of bookings by number of weekend nights
fig = px.histogram(data_frame=hotels,
                   x='no_of_weekend_nights', title='Bookings by number of weekend nights', color='no_of_weekend_nights')
fig.show()
# Distribution of bookings by number of week nights
fig = px.histogram(data_frame=hotels,
                   x='no_of_week_nights', title='Bookings by number of week nights', color='no_of_week_nights')
fig.show()
# Distribution of bookings by meal plan type
fig = px.histogram(data_frame=hotels,
                   x='type_of_meal_plan', title='Bookings by type of meal plan', color='type_of_meal_plan')
fig.show()
# Distribution of bookings by whether a car parking space is required
fig = px.histogram(data_frame=hotels,
                   x='required_car_parking_space', title='Bookings by car parking space requirement', color='required_car_parking_space')
fig.show()