Have you ever left a review to express how you feel about a product or a service? And do you have a habit of checking a product’s reviews online before you buy it? This kind of information is valuable not only for you but also for companies. In this course, you will learn how to make sense of the sentiment expressed in various documents. You will work with real-world datasets featuring tweets, movie reviews, and product reviews, using Python’s nltk and scikit-learn packages. By the end of the course, you will be able to carry out an end-to-end sentiment analysis task based on how US airline passengers expressed their feelings on Twitter.
Sentiment Analysis Nuts and Bolts (Free)
Have you ever checked the reviews or ratings of a product or a service before you purchased it? Then you have very likely come face-to-face with sentiment analysis. In this chapter, you will learn the basic structure of a sentiment analysis problem and start exploring the sentiment of movie reviews.
Welcome! (50 xp)
Elements of a sentiment analysis problem (50 xp)
How many positive and negative reviews are there? (100 xp)
Longest and shortest reviews (100 xp)
Sentiment analysis types and approaches (50 xp)
Detecting the sentiment of Tale of Two Cities (100 xp)
Comparing the sentiment of two strings (100 xp)
What is the sentiment of a movie review? (100 xp)
Let's build a word cloud! (50 xp)
Your first word cloud (100 xp)
Which words are in the word cloud? (50 xp)
Word Cloud on movie reviews (100 xp)
Numeric Features from Reviews
Imagine you are in the shoes of a company offering a variety of products. You want to know which of your products are bestsellers and, most of all, why. We take the first step toward understanding product reviews, using a dataset of Amazon product reviews. To that end, we transform the text into a numeric form and consider a few complexities in the process.
Bag-of-words (50 xp)
Which statement about BOW is true? (50 xp)
Your first BOW (100 xp)
BOW using product reviews (100 xp)
Getting granular with n-grams (50 xp)
Specify token sequence length with BOW (100 xp)
Size of vocabulary of movies reviews (100 xp)
BOW with n-grams and vocabulary size (100 xp)
Build new features from text (50 xp)
Tokenize a string from GoT (100 xp)
Word tokens from the Avengers (100 xp)
A feature for the length of a review (100 xp)
Can you guess the language? (50 xp)
Identify the language of a string (100 xp)
Detect language of a list of strings (100 xp)
Language detection of product reviews (100 xp)
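To give a feel for what "transforming text into a numeric form" means, here is a minimal, dependency-free sketch of a bag-of-words with optional n-grams. It is not the course's own code: the course uses scikit-learn (which provides `CountVectorizer` for this purpose), while the function names, the tokenization rule, and the two sample reviews below are assumptions made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and keep runs of letters, digits, or apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    # All contiguous sequences of n tokens, joined with a space.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(docs, ngram_range=(1, 1)):
    """Return (vocabulary, count matrix) for a list of documents."""
    lo, hi = ngram_range
    counts = []
    for doc in docs:
        tokens = tokenize(doc)
        terms = []
        for n in range(lo, hi + 1):
            terms.extend(ngrams(tokens, n))
        counts.append(Counter(terms))
    # The vocabulary is every distinct term seen in any document.
    vocab = sorted(set().union(*counts))
    # One row per document, one column per vocabulary term.
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


# Hypothetical sample reviews, not taken from the course datasets.
reviews = ["Great product, works great", "Not great, stopped working"]

vocab, matrix = bag_of_words(reviews)
print(vocab)
print(matrix)

# Adding bigrams grows the vocabulary, one of the "complexities" of n-grams.
vocab_bigrams, _ = bag_of_words(reviews, ngram_range=(1, 2))
print(len(vocab), len(vocab_bigrams))
```

Each row of the matrix counts how often each vocabulary term occurs in one review. Word order is lost with single tokens, which is why longer token sequences can help, at the cost of a larger vocabulary.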
More on Numeric Vectors: Transforming Tweets
This chapter continues the process of understanding product reviews. We will cover additional complexities, especially when working with sentiment analysis data from social media platforms such as Twitter. We will also learn other ways to obtain numeric features from the text.
Stop words (50 xp)
Word cloud of tweets (100 xp)
Airline sentiment with stop words (100 xp)
Multiple text columns (100 xp)
Capturing a token pattern (50 xp)
Specify the token pattern (100 xp)
String operators with the Twitter data (100 xp)
More string operators and Twitter (100 xp)
Stemming and lemmatization (50 xp)
Stems and lemmas from GoT (100 xp)
Stem Spanish reviews (100 xp)
Stems from tweets (100 xp)
TfIdf: More ways to transform text (50 xp)
Your first TfIdf (100 xp)
TfIdf on Twitter airline sentiment data (100 xp)
Tfidf and a BOW on same data (100 xp)
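As a rough illustration of two ideas from this chapter, stop-word removal and TfIdf weighting, the sketch below computes a plain textbook TfIdf score (term frequency times the log of inverse document frequency) in pure Python. It is an assumption-laden sketch rather than the course's method: the stop-word list and the sample tweets are invented, and scikit-learn's `TfidfVectorizer` applies smoothing and normalization by default, so its numbers will differ from these.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list, far shorter than real ones.
STOP_WORDS = {"the", "a", "to", "is", "my", "was", "and"}


def tokenize(text):
    # Lowercase, keep runs of letters or apostrophes, then drop stop words.
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: score} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        scores.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return scores


# Hypothetical tweets, not taken from the airline sentiment dataset.
tweets = [
    "The flight was late and the crew was rude",
    "The flight was great",
    "My bag is lost",
]

scores = tfidf(tweets)
for tweet, score in zip(tweets, scores):
    print(tweet, score)
```

A word such as "flight", which appears in several tweets, gets a lower score than a word such as "late", which appears in only one. That is the main difference from a raw bag-of-words count.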
Let's Predict the Sentiment
We employ machine learning to predict the sentiment of a review based on the words used in the review. We use logistic regression and evaluate its performance in a few different ways. These are some solid first models!
Let's predict the sentiment! (50 xp)
Logistic regression of movie reviews (100 xp)
Logistic regression using Twitter data (100 xp)
Did we really predict the sentiment well? (50 xp)
Build and assess a model: movies reviews (100 xp)
Performance metrics of Twitter data (100 xp)
Build and assess a model: product reviews data (100 xp)
Logistic regression: revisited (50 xp)
Predict probabilities of movie reviews (100 xp)
Product reviews with regularization (100 xp)
Regularizing models with Twitter data (100 xp)
Bringing it all together (50 xp)
Step 1: Word cloud and feature creation (100 xp)
Step 2: Building a vectorizer (100 xp)
Step 3: Building a classifier (100 xp)
Wrap up (50 xp)
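The idea of predicting sentiment from the words in a review can be sketched with a small, dependency-free logistic regression trained by gradient descent on word counts, with an L2 penalty standing in for regularization. This is only an illustration under stated assumptions: the course itself uses scikit-learn (whose `LogisticRegression` controls regularization through its `C` parameter), and the helper names, hyperparameters, and four toy reviews below are hypothetical.

```python
import math
import re


def tokenize(text):
    # Assumption: lowercase and keep runs of letters or apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(docs, labels, epochs=200, lr=0.5, l2=0.01):
    """Fit weights on bag-of-words counts; labels are 1 (positive) or 0 (negative)."""
    vocab = sorted({t for d in docs for t in tokenize(d)})
    index = {t: i for i, t in enumerate(vocab)}
    # Build the count matrix: one row per document.
    X = []
    for d in docs:
        row = [0.0] * len(vocab)
        for t in tokenize(d):
            row[index[t]] += 1.0
        X.append(row)
    w = [0.0] * len(vocab)
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        # The L2 penalty pulls every weight toward zero.
        grad_w = [l2 * wj for wj in w]
        grad_b = 0.0
        for row, y in zip(X, labels):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - y
            for j, xj in enumerate(row):
                grad_w[j] += err * xj / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return index, w, b


def predict_proba(text, index, w, b):
    # Probability that the text is positive; unseen words are ignored.
    z = b + sum(w[index[t]] for t in tokenize(text) if t in index)
    return sigmoid(z)


# Hypothetical toy reviews, not taken from the course datasets.
train_docs = [
    "great movie loved it",
    "wonderful acting great story",
    "terrible movie hated it",
    "boring story awful acting",
]
train_labels = [1, 1, 0, 0]

index, w, b = train_logreg(train_docs, train_labels)
p_pos = predict_proba("great wonderful", index, w, b)
p_neg = predict_proba("terrible boring", index, w, b)
print(round(p_pos, 3), round(p_neg, 3))
```

Words that appear only in positive reviews end up with positive weights and push the predicted probability above 0.5, while words shared by both classes stay close to zero. On real data, performance should be assessed on reviews held out from training, not on a handful of hand-picked strings.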
In the following tracks: Natural Language Processing in Python
Prerequisites: Python Data Science Toolbox (Part 2)
Violeta Misheva
Violeta is a data scientist passionate about machine learning, natural language processing, and fair and explainable algorithms, among other topics. She supplements her machine learning knowledge with her doctorate in applied econometrics and likes working on complex problems that require multi-disciplinary expertise. She regularly presents projects and initiatives she has worked on at conferences and is an advocate for diversity in the tech industry.