Comparing Cosmetics by Ingredients

Process ingredient lists for cosmetics sold on Sephora, then visualize their similarity using t-SNE and Bokeh.

11 Tasks, 1,500 XP

Project Description

Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's tough to interpret those ingredient lists unless you have a background in chemistry.

Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. In this Project, you are going to create a content-based recommendation system where the 'content' will be the chemical components of cosmetics. Specifically, you will process ingredient lists for 1472 cosmetics on Sephora via word embedding, then visualize ingredient similarity using a machine learning method called t-SNE and an interactive visualization library called Bokeh.
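One simple way to turn ingredient lists into vectors is a binary cosmetic-ingredient matrix, where each row is a product and each column records whether an ingredient is present. The following is a minimal sketch of that idea in plain Python, not the Project's own code. The product names and ingredient lists are invented for illustration, and the Jaccard overlap used to compare two rows is an assumption chosen for simplicity.

```python
# Hypothetical products and ingredient lists, invented for illustration only.
products = {
    "Moisturizer A": "Water, Glycerin, Dimethicone, Tocopherol",
    "Moisturizer B": "Water, Glycerin, Squalane",
    "Moisturizer C": "Aloe Barbadensis Leaf Juice, Squalane, Tocopherol",
}


def tokenize(ingredient_list):
    """Lowercase a comma-separated ingredient string and split it into tokens."""
    return [t.strip().lower() for t in ingredient_list.split(",") if t.strip()]


# Tokenize every product and assign each distinct ingredient a column index,
# in order of first appearance.
corpus = {}
ingredient_index = {}
for name, raw in products.items():
    tokens = tokenize(raw)
    corpus[name] = tokens
    for token in tokens:
        ingredient_index.setdefault(token, len(ingredient_index))


def one_hot(tokens, index):
    """Return a 0/1 row with a 1 in the column of every ingredient present."""
    row = [0] * len(index)
    for token in tokens:
        row[index[token]] = 1
    return row


# The cosmetic-ingredient matrix: one binary row per product.
matrix = {name: one_hot(tokens, ingredient_index) for name, tokens in corpus.items()}


def jaccard(row_a, row_b):
    """Share of ingredients two products have in common (assumed similarity measure)."""
    shared = sum(a & b for a, b in zip(row_a, row_b))
    total = sum(a | b for a, b in zip(row_a, row_b))
    return shared / total if total else 0.0


for name, row in matrix.items():
    print(name, row)
print(jaccard(matrix["Moisturizer A"], matrix["Moisturizer B"]))
```

With real data, the rows of this matrix would be the input to a dimension reduction step such as t-SNE.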

Project Tasks

1. Cosmetics, chemicals... it's complicated
2. Focus on one product category and one skin type
3. Tokenizing the ingredients
4. Initializing a document-term matrix (DTM)
5. Creating a counter function
6. The Cosmetic-Ingredient matrix!
7. Dimension reduction with t-SNE
8. Let's map the items with Bokeh
9. Adding a hover tool
10. Mapping the cosmetic items
11. Comparing two products
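
The later tasks (t-SNE, the Bokeh map, the hover tool, and the comparison of two products) could look roughly like the sketch below. It assumes NumPy, scikit-learn, and Bokeh are installed. The matrix is random placeholder data standing in for a real cosmetic-ingredient matrix, and the product names, perplexity value, and tooltip label are hypothetical choices rather than the Project's settings.

```python
import numpy as np
from sklearn.manifold import TSNE
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Placeholder binary matrix (8 products x 5 ingredients), not real Sephora data.
rng = np.random.default_rng(0)
matrix = rng.integers(0, 2, size=(8, 5)).astype(float)
names = [f"Product {i}" for i in range(len(matrix))]  # hypothetical names

# Reduce each product's ingredient vector to two dimensions.
# perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=3, random_state=42)
coords = tsne.fit_transform(matrix)

# Put the coordinates and names in a data source that Bokeh can plot from.
source = ColumnDataSource(
    data={"x": coords[:, 0], "y": coords[:, 1], "name": names}
)

plot = figure(
    title="Cosmetics mapped by ingredient similarity (placeholder data)",
    x_axis_label="t-SNE 1",
    y_axis_label="t-SNE 2",
)
plot.scatter("x", "y", source=source, size=10)

# The hover tool shows the product name when the cursor is over a point.
plot.add_tools(HoverTool(tooltips=[("Item", "@name")]))

# Compare two products by their distance on the 2-D map.
distance = float(np.linalg.norm(coords[0] - coords[1]))
print(f"Distance between {names[0]} and {names[1]}: {distance:.2f}")
```

To display the map, call `show(plot)` after importing `show` from `bokeh.plotting`. Distances on a t-SNE map are only a rough guide to similarity, so it is worth checking the two products' ingredient lists directly as well.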

Technologies

Python

Topics

Data Manipulation, Data Visualization, Machine Learning

Jiwon Jeong

Graduate Research Assistant at Yonsei University

Jiwon is a graduate student majoring in Industrial Engineering at Yonsei University. Her core research area is developing business strategies with a statistical approach. She has a passion for finding novel applications of machine learning. Outside of school, she loves travel and books and is an advocate of healthy living.

FAQs

Is this project suitable for beginners?

What is the programming language of this project?

Can I add this project to my Data Portfolio?

Do I need to download any software to complete this project?
