
Project

Find Movie Similarity from Plot Summaries

Beginner
Updated 09/2024
Use NLP and clustering on movie plot summaries from IMDb and Wikipedia to quantify movie similarity.

Included with Premium or Teams

Python · Data Manipulation · Data Visualization · Machine Learning · Probability & Statistics · 45 minutes · 12 Tasks · 1,500 XP · 6,961

Project Description


Natural Language Processing (NLP) is a field in which data scientists develop algorithms that make sense of the conversational language humans use. In this Project, you will use NLP to measure how similar movies are to one another, based on their plot summaries from IMDb and Wikipedia. The dataset contains the titles of the top 100 movies on IMDb, along with each movie's plot summary from both IMDb and Wikipedia.
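To make the dataset description concrete, here is a minimal sketch of how titles and the two plot sources might be held in a pandas DataFrame and merged into a single text field per movie. The rows, the column names (`title`, `imdb_plot`, `wiki_plot`, `plot`), and the use of pandas itself are assumptions for illustration; the real dataset's schema may differ.

```python
import pandas as pd

# Hypothetical rows and column names; the real dataset's schema may differ.
movies = pd.DataFrame(
    {
        "title": ["Space Saga", "Mob Family"],
        "imdb_plot": [
            "A pilot joins the rebels.",
            "A crime family fights its rivals.",
        ],
        "wiki_plot": ["The rebels attack the empire.", None],
    }
)

# Treat a missing summary as empty text so concatenation never yields NaN.
plot_columns = ["imdb_plot", "wiki_plot"]
movies[plot_columns] = movies[plot_columns].fillna("")

# Join both sources into one text field per movie.
movies["plot"] = (movies["wiki_plot"] + " " + movies["imdb_plot"]).str.strip()

print(movies[["title", "plot"]])
```

Combining the two sources gives each movie more text to work with, which tends to help term-weighting methods that rely on word counts.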

  • 1. Import and observe dataset
  • 2. Combine Wikipedia and IMDb plot summaries
  • 3. Tokenization
  • 4. Stemming
  • 5. Club together Tokenize & Stem
  • 6. Create TfidfVectorizer
  • 7. Fit transform TfidfVectorizer
  • 8. Import KMeans and create clusters
  • 9. Calculate similarity distance
  • 10. Import Matplotlib, Linkage, and Dendrograms
  • 11. Create merging and plot dendrogram
  • 12. Which movies are most similar?
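The tasks above can be sketched end to end on toy data. This is an illustrative outline under stated assumptions, not the project's own solution: the titles and plots are invented, `tokenize_and_stem` uses a crude suffix rule as a stand-in for a real stemmer (an NLTK stemmer could be dropped in instead), and the cluster count, cosine distance, and complete linkage are choices made for the example. `TfidfVectorizer`, `KMeans`, `cosine_similarity`, `linkage`, and `dendrogram` are real scikit-learn and SciPy calls.

```python
import re

import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy data standing in for the real titles and combined plots.
titles = ["Space Saga", "Star Quest", "Mob Family", "Crime Boss"]
plots = [
    "A pilot joins rebels fighting an empire among the stars.",
    "Rebels and a young pilot battle an empire across the stars.",
    "A crime family fights rival gangsters for control of the city.",
    "Gangsters and a crime boss struggle for control of the city.",
]


def tokenize_and_stem(text):
    """Lowercase, keep alphabetic tokens, and strip a few common suffixes.

    The suffix rule is a crude stand-in for a real stemmer.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return [
        re.sub(r"(ing|ers|er|es|s)$", "", t) if len(t) > 4 else t
        for t in tokens
    ]


# Tasks 6-7: turn each plot into a TF-IDF vector using the custom tokenizer.
vectorizer = TfidfVectorizer(tokenizer=tokenize_and_stem, token_pattern=None)
tfidf_matrix = vectorizer.fit_transform(plots)

# Task 8: group the plots into clusters (2 is an arbitrary choice here).
km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(tfidf_matrix)

# Task 9: cosine distance, cleaned of tiny floating-point artifacts.
similarity_distance = 1 - cosine_similarity(tfidf_matrix)
similarity_distance = np.clip(similarity_distance, 0, None)
np.fill_diagonal(similarity_distance, 0)

# Tasks 10-11: hierarchical merging on the condensed distance matrix.
mergings = linkage(squareform(similarity_distance, checks=False), method="complete")
# no_plot=True returns the tree layout without drawing; drop it (with
# Matplotlib installed) to render the dendrogram.
tree = dendrogram(mergings, labels=titles, orientation="left", no_plot=True)

# Task 12: for each movie, the other movie at the smallest distance.
masked = similarity_distance.copy()
np.fill_diagonal(masked, np.inf)
most_similar = {titles[i]: titles[j] for i, j in enumerate(masked.argmin(axis=1))}

print(labels)
print(most_similar)
```

Passing the condensed form from `squareform` tells `linkage` that the input is already a set of pairwise distances rather than raw observations, so the merge heights in the dendrogram are the cosine distances themselves.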
