YouTube Trends Analysis
Executive Summary
Project Overview
As data scientists at a global marketing agency, we set out to identify the most effective YouTube videos for promoting our clients’ brands. The analysis covered three tasks: understanding YouTube video trends, performing sentiment analysis on video comments, and developing a video ranking model. The end goal was to recommend videos for an E-Learning platform focused on Data and AI skills.
Data Summary
- Videos Stats: Aggregated data for each YouTube video, including views, likes, comment counts, and keywords.
- Comments: Details about individual comments on YouTube videos, such as the comment text, likes, and sentiment scores.
Key Analyses and Findings
- Exploratory Data Analysis of YouTube Trends:
  - Analyzed video trends across different industries based on keywords.
  - Identified top-performing industries by average views, with "Google," "Animals," and "MrBeast" leading in viewership.
- Sentiment Analysis of Video Comments:
  - Merged comments data with video stats to include industry information.
  - Visualized sentiment trends, finding that industries like "MrBeast," "Animals," and "Music" had higher average sentiment scores, indicating positive viewer perceptions.
- Development of a Video Ranking Model:
  - Normalized engagement metrics (views, likes, comments) and sentiment scores.
  - Calculated a composite score to rank videos based on overall engagement and sentiment.
  - Identified top-ranked videos, highlighting those with high engagement and positive sentiment.
- Strategic Recommendation for E-Learning Collaboration:
  - Filtered videos related to Data and AI skills.
  - Recommended the top three videos for an E-Learning platform:
    - "How I Learnt Machine Learning In 6 Steps (3 months)": Highly relevant for learning machine learning, with a composite score of 0.870.
    - "I bought the BIGGEST Tech in the world.": Showcases cutting-edge technology, appealing to tech enthusiasts, with a composite score of 0.800.
    - "I bought the THINNEST Tech in the world.": Features innovative tech trends, suitable for an audience interested in the latest advancements, with a composite score of 0.786.
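The ranking steps listed above (merge comment sentiment into the video stats, normalize each metric, combine into one score) can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names `Video ID`, `Views`, and `Sentiment`, the min-max scaling, and the equal weighting of the four metrics are all assumptions, and the toy data is made up purely to show the mechanics.

```python
import pandas as pd

def min_max(series: pd.Series) -> pd.Series:
    """Scale a numeric series to [0, 1]; a constant series maps to 0."""
    span = series.max() - series.min()
    if span == 0:
        return pd.Series(0.0, index=series.index)
    return (series - series.min()) / span

def rank_videos(videos: pd.DataFrame, comments: pd.DataFrame) -> pd.DataFrame:
    # Average comment sentiment per video (column names are assumptions).
    avg_sentiment = (
        comments.groupby("Video ID")["Sentiment"]
        .mean()
        .rename("Avg Sentiment")
        .reset_index()
    )
    ranked = videos.merge(avg_sentiment, on="Video ID", how="inner")

    # Normalize every metric so that no single scale dominates the score.
    metrics = ["Views", "Likes", "Comments", "Avg Sentiment"]
    for col in metrics:
        ranked[col + " (norm)"] = min_max(ranked[col])

    # Equal weights are an assumption; other weightings are equally plausible.
    norm_cols = [col + " (norm)" for col in metrics]
    ranked["Composite Score"] = ranked[norm_cols].mean(axis=1)
    return ranked.sort_values("Composite Score", ascending=False).reset_index(drop=True)

# Made-up toy data, only to exercise the function.
toy_videos = pd.DataFrame({
    "Video ID": ["a", "b", "c"],
    "Views": [100, 50, 0],
    "Likes": [10, 5, 0],
    "Comments": [4, 2, 0],
})
toy_comments = pd.DataFrame({
    "Video ID": ["a", "a", "b", "c"],
    "Sentiment": [2, 2, 1, 0],
})
ranked = rank_videos(toy_videos, toy_comments)
print(ranked[["Video ID", "Composite Score"]])
```

Min-max scaling keeps every metric in the same 0 to 1 range, which makes the composite easy to read, but it is sensitive to outliers such as a single viral video; a log transform before scaling would be one way to dampen that.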
Conclusion
By leveraging data analytics and sentiment analysis, we identified highly engaging and positively perceived YouTube videos. The recommended videos are strategically aligned with the content focus of an E-Learning platform on Data and AI skills, ensuring maximum relevance and potential impact.
1) Exploratory Data Analysis of YouTube Trends
# Display summary statistics for videos_stats dataframe
videos_stats_summary = videos_stats.describe()
# Display summary statistics for comments dataframe
comments_summary = comments.describe()
videos_stats_summary
comments_summary
# Identify rows with negative values in Likes or Comments
negative_likes_comments = videos_stats[(videos_stats['Likes'] < 0) | (videos_stats['Comments'] < 0)]
# Count of rows with negative values
negative_likes_comments_count = negative_likes_comments.shape[0]
negative_likes_comments
# Remove rows with negative values in Likes or Comments
videos_stats_cleaned = videos_stats[(videos_stats['Likes'] >= 0) & (videos_stats['Comments'] >= 0)]
# Verify the rows have been removed
videos_stats_cleaned.shape[0], videos_stats.shape[0] - videos_stats_cleaned.shape[0] # (remaining rows, removed rows)
videos_stats_cleaned
videos_stats_cleaned.describe()
Graph: Distribution of Videos by Industry
The bar chart above shows the number of videos per industry (keyword); the tallest bars correspond to the keywords with the most videos in the dataset.
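The counts behind a chart like this can be obtained with a single `value_counts` call. The sketch below assumes the industry label is stored in a column named `Keyword` (an assumption, since the column name is not shown here) and uses a made-up sample frame in place of `videos_stats_cleaned`.

```python
import pandas as pd

# Made-up sample; in the notebook this would be videos_stats_cleaned.
sample_videos = pd.DataFrame(
    {"Keyword": ["tech", "music", "tech", "animals", "tech", "music"]}
)

# Number of videos per keyword, sorted from most to least common.
videos_per_industry = sample_videos["Keyword"].value_counts()
print(videos_per_industry)
```

With matplotlib installed, `videos_per_industry.plot(kind="bar")` would draw the corresponding bar chart.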
Next, let's calculate and visualize the average engagement metrics (views, likes, and comments) for videos in each industry.
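One way to compute these averages is a `groupby` on the industry column followed by `mean`. This is a sketch under assumptions: the column names `Keyword` and `Views` are guesses (only `Likes` and `Comments` appear in the code above), and the sample frame is invented for illustration.

```python
import pandas as pd

# Made-up sample; in the notebook this would be videos_stats_cleaned.
sample_videos = pd.DataFrame({
    "Keyword": ["tech", "tech", "music", "music"],
    "Views": [100, 300, 50, 150],
    "Likes": [10, 30, 5, 15],
    "Comments": [1, 3, 2, 4],
})

# Mean of each engagement metric per keyword, highest average views first.
avg_metrics = (
    sample_videos
    .groupby("Keyword")[["Views", "Likes", "Comments"]]
    .mean()
    .sort_values("Views", ascending=False)
)
print(avg_metrics)
```

With matplotlib installed, `avg_metrics.plot(kind="bar", subplots=True)` would draw one bar chart per metric, which avoids views dwarfing likes and comments on a shared axis.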
Graph: Average Views, Likes and Comments by Industry