Joining Exercises
Practice for the Joining chapter
import pandas as pd
df_movies = pd.read_pickle('movies.p')
df_ratings = pd.read_pickle('ratings.p')
print(df_movies.shape)
print(df_ratings.shape)
print(df_movies.head())
print(df_ratings.head())
ADD RATING TO DATA SET
movies = df_movies.merge(df_ratings, on = 'id', how = 'left')
print(movies.head())
print(movies.columns)
Most Popular
pop = movies.sort_values('popularity', ascending = False)
print(pop[['id','title','popularity']].head())
Check available years
print(movies[['release_date']].sort_values('release_date', ascending = False).head())
** Select movies from a single year (2017 is the latest year; the code below filters on 2005) **
movies['release_date'] = pd.to_datetime(movies['release_date'])
movies['year'] = movies['release_date'].dt.year
print(movies.head())
print(movies.shape)
df_2005 = movies[movies['year'] == 2005]
print(df_2005)
df_2005 = df_2005.reset_index(drop=True)
print(df_2005.sort_values('popularity', ascending=False).head())
print(df_2005.columns)
Add the Tagline dataset with a left join, to see which movies are missing a tagline.
- Also do a groupby on year and get the movie with the highest popularity or votes, using the max function.
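A minimal sketch of both to-do items, run on small invented tables rather than the real pickles. The names `toy_movies`, `toy_taglines`, and the `tagline` column are assumptions for illustration; the actual tagline file and its columns are not shown in this notebook. Note that `max` alone returns the highest popularity value per year, so the sketch also uses `idxmax` to recover the movie row itself.

```python
import pandas as pd

# Toy stand-ins for the real tables; all values are invented for illustration.
toy_movies = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'title': ['A', 'B', 'C', 'D'],
    'popularity': [10.0, 25.5, 7.2, 30.1],
    'year': [2005, 2005, 2006, 2006],
})
toy_taglines = pd.DataFrame({
    'id': [1, 4],
    'tagline': ['First tagline', 'Fourth tagline'],
})

# Left join keeps every movie; indicator=True adds a '_merge' column
# recording whether a matching tagline row was found.
movies_taglines = toy_movies.merge(toy_taglines, on='id', how='left', indicator=True)
missing_tagline = movies_taglines[movies_taglines['_merge'] == 'left_only']
print(missing_tagline[['id', 'title']])

# max() gives the highest popularity per year, but not which movie it belongs to.
max_pop_by_year = toy_movies.groupby('year')['popularity'].max()
print(max_pop_by_year)

# idxmax() returns the row label of each group's maximum, so .loc recovers the full row.
top_by_year = toy_movies.loc[toy_movies.groupby('year')['popularity'].idxmax()]
print(top_by_year[['year', 'title', 'popularity']])
```

On the real data the same pattern would apply to `movies` after the `year` column has been created, swapping `popularity` for a vote column if that is the measure of interest.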