The Android App Market on Google Play UNGUIDED

Introduction


Mobile apps are everywhere. They are easy to create and can be very lucrative from a business standpoint. Android in particular keeps expanding as an operating system and has captured more than 74% of the total market^[1].

The Google Play Store apps data has enormous potential to facilitate data-driven decisions and insights for businesses. In this notebook, we will analyze the Android app market by comparing ~10k apps in Google Play across different categories. We will also use the user reviews to draw a qualitative comparison between the apps.

The dataset you will use here was scraped from the Google Play Store in September 2018 and was published on Kaggle. Here are the details:

datasets/apps.csv

This file contains all the details of the apps on Google Play. There are 9 features that describe a given app.

App: Name of the app
Category: Category of the app. Some examples are: ART_AND_DESIGN, FINANCE, COMICS, BEAUTY etc.
Rating: The current average rating (out of 5) of the app on Google Play
Reviews: Number of user reviews given on the app
Size: Size of the app in MB (megabytes)
Installs: Number of times the app was downloaded from Google Play
Type: Whether the app is paid or free
Price: Price of the app in US$
Last Updated: Date on which the app was last updated on Google Play

datasets/user_reviews.csv

This file contains a random sample of 100 [most helpful first](https://www.androidpolice.com/2019/01/21/google-play-stores-redesigned-ratings-and-reviews-section-lets-you-easily-filter-by-star-rating/) user reviews for each app. The text in each review has been pre-processed and passed through a sentiment analyzer.

App: Name of the app on which the user review was provided. Matches the `App` column of the `apps.csv` file
Review: The pre-processed user review text
Sentiment Category: Sentiment category of the user review - Positive, Negative or Neutral
Sentiment Score: Sentiment score of the user review. It lies in the range [-1, 1]. A higher score denotes a more positive sentiment.

From here on, it will be your task to explore and manipulate the data until you are able to answer the three questions described in the instructions panel.

# Use this cell to begin your analysis, and add as many as you would like!
# import pandas and numpy with their usual aliases

import pandas as pd 
import numpy as np


1. Data Cleaning

# Import and explore the file apps.csv
app_df = pd.read_csv('datasets/apps.csv')
app_df.info()
app_df.head()

# convert the Installs column into an integer type:
# strip the ',' and '+' characters, then cast the cleaned strings to int
# (no separate drop is needed; assigning to the column overwrites it in place)
app_df['Installs'] = (
    app_df['Installs']
    .str.replace(',', '', regex=False)
    .str.replace('+', '', regex=False)
    .astype(int)
)
apps = app_df

# display the new dataframe
apps.info()
apps.head()
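
The cleaning step can be checked on a small made-up sample before running it on the full file. The sketch below assumes the install counts are strings such as `10,000+`; the `sample` frame and its values are hypothetical and only illustrate the transformation. It uses a single regex character class instead of two chained replacements, which is an alternative rather than the approach used in the cell above.

```python
import pandas as pd

# Hypothetical sample mimicking the raw Installs strings (not real data)
sample = pd.DataFrame({
    'App': ['A', 'B', 'C'],
    'Installs': ['10,000+', '500+', '1,000,000+'],
})

# Remove every ',' and '+' in one pass, then cast the result to int
sample['Installs'] = (
    sample['Installs']
    .str.replace(r'[,+]', '', regex=True)
    .astype(int)
)

print(sample['Installs'].tolist())  # → [10000, 500, 1000000]
```

If a value contains any other non-digit character, `astype(int)` raises a `ValueError`, which is a useful early signal that the column needs more cleaning.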

2. Summarize Data

# Group each app by its category then count the number of apps per category
app_category= apps.groupby('Category').agg({'App':'count'})

# rename column
app_category.columns= ['Number of apps']

# compute the average price and average rating per category
app_cat = pd.pivot_table(apps, index='Category', values=['Price', 'Rating'], aggfunc='mean')

# rename the columns by name so the labels cannot be swapped
app_cat = app_cat.rename(columns={'Price': 'Average price', 'Rating': 'Average rating'})

# Combine the two dataframes on their shared Category index
app_category_info = app_category.merge(app_cat, on='Category')

# display the result
app_category_info.head()
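
The same per-category summary can also be produced in a single `groupby` call with named aggregation, which avoids the separate pivot table and merge. This is only a sketch on a hypothetical sample, and it assumes `Price` and `Rating` are already numeric columns; the `sample` frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample with the columns used in the summary (not real data)
sample = pd.DataFrame({
    'App': ['A', 'B', 'C', 'D'],
    'Category': ['FINANCE', 'FINANCE', 'COMICS', 'COMICS'],
    'Price': [0.0, 2.0, 0.0, 0.0],
    'Rating': [4.0, 5.0, 3.0, None],
})

# Named aggregation: output column name -> (input column, aggregation).
# The dict is unpacked because the output names contain spaces.
summary = sample.groupby('Category').agg(
    **{
        'Number of apps': ('App', 'count'),
        'Average price': ('Price', 'mean'),
        'Average rating': ('Rating', 'mean'),
    }
)

print(summary)
```

Note that `mean` skips missing ratings, so a category's average rating is computed only over the apps that have one.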

3. Data Filtering

# Import the file user_reviews.csv
user_rev = pd.read_csv('datasets/user_reviews.csv')

# merge 'apps' and 'user_rev' as 'Apps_data', then keep the free finance apps and average their sentiment score per app
Apps_data= apps.merge(user_rev, on= 'App') 
user_feedback= Apps_data.query("Type=='Free' and Category=='FINANCE'")
user_feedback_sentiment= user_feedback.groupby('App').agg({'Sentiment Score':'mean'})

# sort by average sentiment score in descending order and keep the top 10 apps
top_10_user_feedback= user_feedback_sentiment.sort_values('Sentiment Score', ascending=False)[:10]

# display the result
top_10_user_feedback
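
Two details of this filtering step are easy to miss: `merge` performs an inner join by default, so apps without any review disappear, and an app whose reviews all lack a score ends up with a missing average. The sketch below illustrates both on hypothetical data; the `apps_sample` and `reviews_sample` frames are made up, and dropping the missing averages is an assumption about the desired behavior, not something the cell above does.

```python
import pandas as pd

# Hypothetical samples (not real data)
apps_sample = pd.DataFrame({
    'App': ['Bank A', 'Bank B', 'Game C', 'Bank D', 'Bank E', 'Bank F'],
    'Category': ['FINANCE', 'FINANCE', 'GAME', 'FINANCE', 'FINANCE', 'FINANCE'],
    'Type': ['Free', 'Free', 'Free', 'Paid', 'Free', 'Free'],
})
reviews_sample = pd.DataFrame({
    'App': ['Bank A', 'Bank A', 'Bank B', 'Game C', 'Bank D', 'Bank E'],
    'Sentiment Score': [0.5, 0.1, None, 0.9, 0.8, 0.7],
})

# Inner join: 'Bank F' has no reviews, so it is not in the merged frame
merged = apps_sample.merge(reviews_sample, on='App')

# Boolean masks are an alternative to the query() string used above
free_finance = merged[(merged['Type'] == 'Free') & (merged['Category'] == 'FINANCE')]

# Average score per app; 'Bank B' has no scored review and is dropped
avg_sentiment = (
    free_finance.groupby('App')['Sentiment Score']
    .mean()
    .dropna()
    .sort_values(ascending=False)
)

print(avg_sentiment.head(10))
```

Here only `Bank E` and `Bank A` remain, in that order, because the paid app, the game, the app without reviews, and the app without scores are all excluded.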
