Which YouTubers for your brand?

🎯 Executive summary

Promoted videos should be engaging and relevant to the platform's content. Using a ranking system, we selected the following videos:

🥇 Learn Python - Full Course for Beginners [Tutorial]
🥈 Python Tutorial - Python Full Course for Beginners
🥉 How I Learnt Machine Learning In 6 Steps (3 months)

In order to identify these top candidates, we will first investigate the Video and Comment tables. Then, we will develop a system to rank the videos.

💡 Video insights


  • Some records share the same video ID and publication time, which means they are duplicates. However, some of these duplicates carry different keywords and stats. To avoid issues when merging the Videos and Comments tables, we average the Likes, Comments, and Views of each set of duplicates and assign the result a new Keyword, 'Duplicate ID'.
  • Videos with no ID ('#NAME?') cannot be merged with the Comments table. Instead of dropping these records, we will keep them for their stats but exclude their comments.
  • Reviewing the videos with the highest stats in each category quickly shows that some are misclassified. For example, many music videos were tagged "animals", "bed", or even "history". We will reclassify them.
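
The three cleaning steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's hidden code: the column names "Video ID" and "Published At", the helper names, and the `overrides` mapping are hypothetical, and the sketch assumes every collapsed duplicate group receives the 'Duplicate ID' keyword.

```python
import pandas as pd

MISSING_ID = "#NAME?"  # placeholder the data uses for videos without an ID
STAT_COLS = ["Likes", "Comments", "Views"]
KEYS = ["Video ID", "Published At"]  # assumed column names


def collapse_duplicates(video: pd.DataFrame) -> pd.DataFrame:
    """Average the stats of records sharing a video ID and publication time."""
    has_id = video["Video ID"] != MISSING_ID
    with_id, without_id = video[has_id], video[~has_id]

    # Flag every member of a duplicated (ID, time) group, not just the repeats.
    dup_mask = with_id.duplicated(subset=KEYS, keep=False)

    # Mean for the stats, first value for any other column.
    agg = {col: "first" for col in with_id.columns if col not in KEYS}
    agg.update({col: "mean" for col in STAT_COLS})
    merged = with_id[dup_mask].groupby(KEYS, as_index=False).agg(agg)
    merged["Keyword"] = "Duplicate ID"

    # Rows without an ID are passed through untouched so their stats are kept.
    return pd.concat([with_id[~dup_mask], merged, without_id], ignore_index=True)


def attach_comments(video: pd.DataFrame, comments: pd.DataFrame) -> pd.DataFrame:
    """Left-join comments onto videos; videos without a real ID get none."""
    valid = comments[comments["Video ID"] != MISSING_ID]
    return video.merge(valid, on="Video ID", how="left", suffixes=("", "_comment"))


def reclassify(video: pd.DataFrame, overrides: dict) -> pd.DataFrame:
    """Apply manual keyword corrections given as {video_id: keyword}."""
    out = video.copy()
    out["Keyword"] = out["Video ID"].map(overrides).fillna(out["Keyword"])
    return out
```

A left join keeps every video row, so records with '#NAME?' survive with empty comment fields instead of being dropped.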

Let's visualize the stats for each keyword after reclassifying. We can see that music and mrbeast are the two keywords with the most likes and views.

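import matplotlib.pyplot as plt
import seaborn as sns

# Mean Likes and Views per keyword, side by side (`video` is the cleaned table)
f, axs = plt.subplots(1, 2, figsize=(10, 8))
sns.barplot(data=video, y="Keyword", x="Likes", hue="Keyword", ax=axs[0], legend=False)
sns.barplot(data=video, y="Keyword", x="Views", hue="Keyword", ax=axs[1], legend=False)
f.tight_layout()
plt.show()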

When people like a video, they tend to watch it repeatedly, hit the Like button, and sometimes comment on it (not necessarily all of these, or in this order). A viewer can watch and comment more than once, but can like a video only once. According to YouTube, likes are a signal it uses to suggest the next video to users. It is therefore worth looking into how many people like or comment on a video after watching it.

The following additional features will be calculated for each video:

  • Likes per View (LPV)
  • Comments per View (CPV)
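
As a minimal sketch of how these two features might be derived, assuming the `Likes`, `Comments`, and `Views` columns of the video table (the helper name `add_engagement_ratios` is hypothetical, and treating non-positive view counts as missing is an assumption of this sketch):

```python
import pandas as pd


def add_engagement_ratios(video: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `video` with Likes per View and Comments per View."""
    out = video.copy()
    # Non-positive view counts become NaN so the ratios are NaN, not inf.
    views = out["Views"].where(out["Views"] > 0)
    out["LPV"] = out["Likes"] / views
    out["CPV"] = out["Comments"] / views
    return out
```

Masking zero views keeps infinite ratios out of any later ranking or plotting step.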

When looking at CPV and LPV, the variation among keywords is much narrower than it was for raw Likes and Views in the last plot. However, it is also worth noting that a higher LPV or CPV does not always mean a video draws more interest from viewers than videos with a lower LPV or CPV. The lower-ratio videos may simply have a much larger number of views. CPV and LPV should therefore be used only as complementary features.
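
One way to put a number on "narrower" is to compare the spread of the per-keyword means for each feature. The sketch below uses the coefficient of variation (standard deviation divided by mean) rather than the raw variance, because Likes and LPV live on very different scales. That choice, the helper name `keyword_spread`, and the presence of an `LPV` column are assumptions for illustration.

```python
import pandas as pd


def keyword_spread(video: pd.DataFrame, columns: list) -> pd.Series:
    """Coefficient of variation of the per-keyword means, one value per column."""
    keyword_means = video.groupby("Keyword")[columns].mean()
    return keyword_means.std() / keyword_means.mean()
```

Calling `keyword_spread(video, ["Likes", "Views", "LPV", "CPV"])` returns one value per feature; smaller values mean the keywords sit closer together on that feature.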
