Course

Predictive Analytics using Networked Data in R

Skill Level: Intermediate

4.8+

Updated 09/2020

Learn to predict labels of nodes in networks using network learning and by extracting descriptive features from the network

R, Probability & Statistics

4 hr

14 videos

56 Exercises

4,300 XP

4,763

Statement of Accomplishment

Course Description

In this course, you will learn to perform state-of-the-art predictive analytics using networked data in R. The aim of network analytics is to predict which class a network node belongs to, such as churner or not, fraudster or not, or defaulter or not. To accomplish this, we discuss how to leverage information from the network and its underlying structure in a predictive way. More specifically, we introduce the idea of featurization: network features are added to non-network features, which can boost the performance of the resulting analytical model. In this course, you will use the igraph package to generate and label a network of customers in a churn setting and learn about the foundations of network learning. Then, you will learn about homophily, dyadicity, and heterophilicity, and how these can be used to get key exploratory insights into your network. Next, you will use the igraph package to compute both node-centric and neighbor-based network features. Furthermore, you will use the Google PageRank algorithm to compute network features and empirically validate their predictive power. Finally, we teach you how to generate a flat dataset from the network and analyze it using logistic regression and random forests.

Prerequisites

Network Analysis in R

Supervised Learning in R: Classification

1

Introduction, networks and labelled networks

In this chapter you will be introduced to labelled networks, network learning, and the challenges that can arise.

Motivation: social networks and predictive analytics

Most likely to churn

Create a network from an edgelist

Labeled networks and network learning

Labeling nodes

Coloring nodes

Visualizing Churners

Relational Neighbor Classifier

Challenges of network-based inference

Challenges in Network learning

Probabilistic Relational Neighbor Classifier

Collective Inferencing
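
As a rough illustration of the relational neighbor idea named in this chapter, the sketch below scores each customer by the share of its neighbors that are churners. The edge list, the labels, and all variable names are made up for illustration and are not taken from the course materials; only the igraph functions are real.

```r
library(igraph)

# Hypothetical toy edgelist: each row connects two customers.
edges <- data.frame(
  from = c("A", "A", "A", "B", "C"),
  to   = c("B", "C", "D", "C", "E")
)
network <- graph_from_data_frame(edges, directed = FALSE)

# Hypothetical labels: 1 = churner, 0 = non-churner.
churn <- c(A = 0, B = 1, C = 1, D = 0, E = 0)
V(network)$churn <- unname(churn[V(network)$name])

# Count churning neighbors by multiplying the adjacency matrix
# with the label vector, then divide by the number of neighbors.
adjacency <- as_adjacency_matrix(network, sparse = FALSE)
churn_neighbors <- as.vector(adjacency %*% V(network)$churn)
rn_score <- churn_neighbors / as.vector(degree(network))
names(rn_score) <- V(network)$name

print(round(rn_score, 2))
# A = 0.67, B = 0.50, C = 0.33, D = 0.00, E = 1.00
```

In this toy network, customer E is connected only to a churner and therefore gets the highest score, which is one reason a single neighbor can dominate the prediction for sparsely connected nodes.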

2

Homophily

In this chapter you will learn about homophily and how to compute the two measures that can be used to characterize it: dyadicity and heterophilicity.

Homophilic networks

Extracting types of edges

Counting types of edges

Counting nodes and computing connectance

Same label edges

Dyadicity of churners

Dyadicity of non-churners

Heterophilicity

Cross label edges

Compute heterophilicity

Summary of homophily

Dyadicity, Heterophilicity, & Homophily

Is the network homophilic?
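
The lesson titles above follow a recipe that can be sketched in a few lines: count edge types, compute the connectance, and compare observed edge counts with what a random labelling would give. The sketch assumes the usual definitions (dyadicity as observed same-label edges over expected same-label edges, heterophilicity as observed cross-label edges over expected cross-label edges); the toy network and variable names are hypothetical.

```r
library(igraph)

# Hypothetical toy network and labels (1 = churner, 0 = non-churner).
edges <- data.frame(
  from = c("A", "A", "A", "B", "C"),
  to   = c("B", "C", "D", "C", "E")
)
network <- graph_from_data_frame(edges, directed = FALSE)
churn <- c(A = 0, B = 1, C = 1, D = 0, E = 0)

# Look up the label at both ends of every edge.
edge_ends  <- ends(network, E(network))
label_from <- churn[edge_ends[, 1]]
label_to   <- churn[edge_ends[, 2]]

churn_edges <- sum(label_from == 1 & label_to == 1)  # churner-churner edges
mixed_edges <- sum(label_from != label_to)           # cross-label edges

# Connectance: fraction of all possible edges that exist.
n_nodes     <- vcount(network)
n_churn     <- sum(churn[V(network)$name] == 1)
n_nonchurn  <- n_nodes - n_churn
connectance <- 2 * ecount(network) / (n_nodes * (n_nodes - 1))

# Expected edge counts if labels were assigned at random.
expected_churn_edges <- n_churn * (n_churn - 1) / 2 * connectance
expected_mixed_edges <- n_churn * n_nonchurn * connectance

dyadicity_churn <- churn_edges / expected_churn_edges  # 2 for this toy network
heterophilicity <- mixed_edges / expected_mixed_edges  # 1 for this toy network
```

A dyadicity above 1 means churners are more densely connected to each other than expected by chance, and a heterophilicity below 1 means fewer cross-label edges than expected. A five-node example is far too small to draw conclusions from; it only shows the arithmetic.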

3

Network Featurization

In this chapter you will use the igraph package to compute various network features and add them to the network.

Basic Network features

Simple network features

Centrality features

Transitivity

Link-Based Features

Adjacency matrices

Link-based features

Second order link-based features

Neighborhood link-based features

Most influential node

Changes in PageRank

Convergence of PageRank

Personalized PageRank

Extract PageRank features
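
To give a flavor of the features this chapter lists, the sketch below collects a few node-level measures and two PageRank variants into one data frame. The toy network, the labels, and the column names are hypothetical; the damping value of 0.85 is simply igraph's default, and restarting the personalized PageRank at known churners is one possible choice rather than the course's prescribed setup.

```r
library(igraph)

# Hypothetical toy network and labels (1 = churner, 0 = non-churner).
edges <- data.frame(
  from = c("A", "A", "A", "B", "C"),
  to   = c("B", "C", "D", "C", "E")
)
network <- graph_from_data_frame(edges, directed = FALSE)
churn <- c(A = 0, B = 1, C = 1, D = 0, E = 0)
churn_flag <- unname(churn[V(network)$name])

# Basic node-centric features.
features <- data.frame(
  node         = V(network)$name,
  degree       = as.vector(degree(network)),
  betweenness  = as.vector(betweenness(network)),
  transitivity = transitivity(network, type = "local", isolates = "zero"),
  pagerank     = as.vector(page_rank(network, damping = 0.85)$vector)
)

# Personalized PageRank: the random walk restarts only at known churners.
restart <- churn_flag / sum(churn_flag)
features$pagerank_churn <- as.vector(
  page_rank(network, damping = 0.85, personalized = restart)$vector
)

print(features)
```

Each row of `features` describes one customer, which is the shape a standard classifier expects.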

4

Putting it all together

In this chapter you will use the network from Chapter 3 to create a flat dataset. Using standard data mining techniques, you will build predictive models and measure their performance with AUC and top decile lift.

Extract a dataset

Getting a flat dataset

Missing Values

Replace missing values

Correlated variables

Building a predictive model

Split into train and test

Logistic regression model

Random forest model

Evaluating model performance

Predicting churn

Measure AUC

Measure top decile lift

Summary and final thoughts
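
The two evaluation measures named in this chapter can be sketched in base R. The helper functions `auc_value` and `top_decile_lift` are hypothetical names written for this illustration (the course may well rely on packages instead), and the labels and scores are invented.

```r
# AUC via the rank-sum formulation: the probability that a randomly
# chosen churner receives a higher score than a randomly chosen non-churner.
auc_value <- function(labels, scores) {
  ranks <- rank(scores)  # ties receive average ranks
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(ranks[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Top decile lift: churn rate among the 10% highest-scored customers
# divided by the overall churn rate.
top_decile_lift <- function(labels, scores) {
  n_top <- ceiling(length(scores) / 10)
  top   <- order(scores, decreasing = TRUE)[seq_len(n_top)]
  mean(labels[top]) / mean(labels)
}

# Invented example: 20 customers, 4 churners, scores in descending order.
labels <- c(1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
scores <- seq(from = 0.95, by = -0.05, length.out = 20)

auc_value(labels, scores)        # 0.9375
top_decile_lift(labels, scores)  # 5
```

Here the two highest-scored customers are both churners, so the churn rate in the top decile is 100% against a 20% base rate, giving a lift of 5.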

Don’t just take our word for it

4.8 from 32 reviews

Rating breakdown: 81%, 19%, 0%, 0%, 0%

FAQs

Is this course suitable for beginners?

No. This course is aimed at intermediate learners with experience working in R. We recommend first taking the "Network Analysis in R" and "Supervised Learning in R: Classification" courses.

What topics does the course cover?

This course covers topics such as labelled networks, network learning, homophily, network featurization, and how to generate a flat dataset from the network for analytics.

What tools will I learn to use?

You will use the igraph package to generate and label a network of customers in a churn setting and learn about the foundations of network learning. You will also use the Google PageRank algorithm to compute network features.

Will I receive a certificate at the end of the course?

Yes, if you successfully complete all the course requirements, you will receive a Statement of Accomplishment.

Who will benefit from this course?

Professionals in data science, artificial intelligence, machine learning, network analysis, and predictive analytics roles would benefit from taking this course.

Does the course contain any exercises?

Yes, the course contains 56 hands-on exercises that require you to apply the concepts you have learned to a customer churn setting.

Are there any prerequisites for this course?

Yes, familiarity with R is recommended, as is first taking the "Network Analysis in R" and "Supervised Learning in R: Classification" courses.
