Log from Twin Peaks Christmas
  • 💪 Competition challenge

    Create a report that covers the following:

    1. Exploratory data analysis of the dataset with informative plots. It's up to you what to include here! Some ideas could include:
      • Analysis of the genres
      • Descriptive statistics and histograms of the grossings
      • Word clouds
    2. Develop a model to predict the movie's domestic gross based on the available features.
      • Remember to preprocess and clean the data first.
      • Think about what features you could define (feature engineering), e.g.:
        • number of times a director appeared in the top 1000 movies list,
        • highest grossing for lead actor(s),
        • decade released
    3. Evaluate your model using appropriate metrics.
    4. Explain some of the limitations of the models you have developed. What other data might help improve the model?
    5. Use your model to predict the grossing of the following fictitious Christmas movie:

    Title: The Magic of Bellmonte Lane

    Description: "The Magic of Bellmonte Lane" is a heartwarming tale set in the charming town of Bellmonte, where Christmas isn't just a holiday, but a season of magic. The story follows Emily, who inherits her grandmother's mystical bookshop. There, she discovers an enchanted book that grants Christmas wishes. As Emily helps the townspeople, she fights to save the shop from a corporate developer, rediscovering the true spirit of Christmas along the way. This family-friendly film blends romance, fantasy, and holiday cheer in a story about community, hope, and magic.

    Director: Greta Gerwig

    Cast:

    • Emma Thompson as Emily, a kind-hearted and curious woman
    • Ian McKellen as Mr. Grayson, the stern corporate developer
    • Tom Hanks as George, the wise and elderly owner of the local cafe
    • Zoe Saldana as Sarah, Emily's supportive best friend
    • Jacob Tremblay as Timmy, a young boy with a special Christmas wish

    Runtime: 105 minutes

    Genres: Family, Fantasy, Romance, Holiday

    Production budget: $25M

    1. Exploratory data analysis of the dataset with informative plots

    import math
    import re
    from collections import Counter
    from typing import Set, Dict
    
    import pandas as pd
    import numpy as np
    
    import matplotlib.pyplot as plt
    import seaborn as sns
    import plotly.express as px
    import plotly.graph_objects as go
    from wordcloud import WordCloud
    from PIL import Image
    
    from sklearn.model_selection import train_test_split, RandomizedSearchCV
    from sklearn.ensemble import ExtraTreesRegressor
    from sklearn.metrics import mean_squared_error
    from sklearn.preprocessing import MinMaxScaler
    from pycaret.regression import *
    import tensorflow as tf
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Dropout
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer
    nltk.download('vader_lexicon')
    

    xmas_movies = pd.read_csv('data/christmas_movies.csv')
    movie_budgets = pd.read_csv('data/movie_budgets.csv')
    xmas_movies.head(3)

    1.1 Analysis of the genres in Christmas movies

    The Christmas movies in this dataset span 26 unique genres. This range suggests that holiday films draw on many cinematic styles and appeal to varied audience preferences.
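
The genre count above can be reproduced with a short helper. This is a minimal sketch, not the notebook's hidden cell: it assumes the `genre` column holds comma-separated labels (as the later `split(',')` call implies), and both the helper name `unique_genres` and the toy data are hypothetical.

```python
import pandas as pd

def unique_genres(genre_column: pd.Series) -> set:
    """Return the distinct genre labels found in a comma-separated genre column."""
    labels = (
        genre_column.dropna()   # skip movies with no genre listed
        .str.split(",")         # "Comedy, Drama" -> ["Comedy", " Drama"]
        .explode()              # one row per (movie, genre) pair
        .str.strip()            # drop the space that follows each comma
    )
    return set(labels[labels != ""])

# Toy stand-in for xmas_movies['genre']; the values are illustrative only
toy = pd.Series(["Comedy, Drama", "Drama, Romance", None, "Family"])
genres = unique_genres(toy)
print(len(genres), sorted(genres))
```

On the real data, `len(unique_genres(xmas_movies['genre']))` would give the number of unique genres, provided the column follows the assumed format.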

    The genre counts show a clear preference for heartwarming, light-hearted themes during the holiday season. Comedy is the most common genre with 452 movies, followed closely by Drama (414) and Romance (385), which reflects the seasonal emphasis on emotive stories about love and family. Family, with 282 movies, is also well represented, suggesting that viewers look for films all ages can watch together. Fantasy (91) and Adventure (47) add a measure of escapism, while Sci-Fi, Western, and War each appear fewer than 10 times, indicating less appetite for intense or niche films at Christmas.
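
Per-genre counts like those quoted above can be computed by tallying every label a movie lists, so a movie tagged "Comedy, Romance" contributes to both genres. The sketch below uses `Counter` (already imported in this notebook). It is an assumption about how the counts were produced, not the original hidden code; the helper name `genre_counts` and the toy data are hypothetical.

```python
from collections import Counter

import pandas as pd

def genre_counts(genre_column: pd.Series) -> Counter:
    """Count how many movies list each genre in a comma-separated genre column."""
    counts = Counter()
    for entry in genre_column.dropna():
        # Each movie adds one to every genre it lists
        counts.update(g.strip() for g in entry.split(",") if g.strip())
    return counts

# Toy stand-in for xmas_movies['genre']; the values are illustrative only
toy = pd.Series(["Comedy, Drama", "Comedy, Romance", "Drama", None])
counts = genre_counts(toy)
for genre, n in counts.most_common(3):
    print(genre, n)
```

Because a movie can carry several genres, these counts add up to more than the number of movies, which is worth keeping in mind when comparing them with the main-genre averages in the next cell.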

    # Keep only movies with a genre; .copy() avoids a SettingWithCopyWarning on the assignment below
    xmas_movies = xmas_movies[pd.notnull(xmas_movies['genre'])].copy()
    # Treat the first listed genre as the movie's main genre
    xmas_movies['main_genre'] = xmas_movies['genre'].apply(lambda x: x.split(',')[0].strip())
    genre_gross_mean = xmas_movies.groupby('main_genre')['gross'].mean().reset_index()
    # 'AVG' is the unweighted mean of the per-genre averages, added as a reference bar
    average_gross = genre_gross_mean['gross'].mean()
    average_row = pd.DataFrame({'main_genre': ['AVG'], 'gross': [average_gross]})
    genre_gross_mean = pd.concat([genre_gross_mean, average_row], ignore_index=True)
    top_genres = genre_gross_mean.sort_values(by='gross', ascending=False).head(9)
    # Highlight the reference bar in a different color
    colors = ['chartreuse' if genre == 'AVG' else '#440154' for genre in top_genres['main_genre']]
    fig_gross = go.Figure(data=[go.Bar(x=top_genres['main_genre'], y=top_genres['gross'], marker_color=colors, text=top_genres['gross'].round(0))])
    fig_gross.update_layout(title='Top 9 Average Gross by Genre', xaxis_title='Main Genre', yaxis_title='Average Gross')
    fig_gross.show()

    'Action' has the highest average gross by a significant margin, which indicates that high-adrenaline titles earn the most per film even though Comedy is the most common genre. 'Animation' and 'Adventure' also perform well, likely because their family-friendly appeal fits holiday viewing habits. The 'AVG' bar is not a genre: it is the unweighted mean of the per-genre averages, included as a reference. Since it falls within the top nine, at most eight main genres exceed it. 'Drama' and 'Comedy' maintain a solid presence, reflecting their traditional appeal, whereas 'Biography' and 'Horror' have lower average grosses, which suggests a more niche audience. Two caveats apply: each movie is assigned only its first listed genre, and a genre's average can be driven by a small number of high-grossing titles. With that in mind, the chart points to strong per-film earnings for genres that offer escapism and match the spirited atmosphere of the season.
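
One way to check how much weight each bar deserves is to report the number of movies behind every average, and to compare the 'AVG' reference (a mean of genre means) with the plain mean over all movies. The sketch below uses made-up numbers purely for illustration; only the column names `main_genre` and `gross` come from the notebook.

```python
import pandas as pd

# Toy data; the gross values are invented for illustration
toy = pd.DataFrame({
    "main_genre": ["Action", "Comedy", "Comedy", "Comedy", "Drama"],
    "gross": [100.0, 10.0, 20.0, 30.0, 40.0],
})

# Mean gross and number of movies per main genre
summary = (
    toy.groupby("main_genre")["gross"]
    .agg(mean_gross="mean", n_movies="count")
    .sort_values("mean_gross", ascending=False)
)

mean_of_genre_means = summary["mean_gross"].mean()  # what the 'AVG' bar plots
overall_mean = toy["gross"].mean()                   # average over all movies

print(summary)
print(round(mean_of_genre_means, 2), round(overall_mean, 2))
```

In this toy example a single Action title lifts the mean of genre means above the overall mean, which is why a genre with very few movies can dominate a chart of averages.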