AWS EC2 Tutorial For Beginners

Discover why you should use Amazon Web Services Elastic Compute Cloud (EC2) and how you can set up a basic data science environment on a Windows instance.
Dec 2017  · 7 min read

Learn about some of the advantages of using Amazon Web Services Elastic Compute Cloud (EC2). The first part of the tutorial covers how to launch and connect to Windows virtual machines, or instances, on EC2. The second part goes over how to set up a basic data science environment (install R, RStudio, and Python) on the instance.

Amazon Web Services Elastic Compute Cloud (EC2): A Brief Case

There are times when one is limited by the capabilities of a desktop or laptop. Suppose a data scientist has a large dataset that they would like to analyze. The scientist tries to load the entire dataset into memory, and an error like the one below occurs.

error

The error occurred because the available RAM was exhausted: the operating system couldn't allocate another 500 MB of RAM. There are many different solutions to this type of problem, and one is to upgrade the RAM of the computer. Besides requiring an investment in more RAM, this approach is limited by how far a given computer can be upgraded. The solution explored in this tutorial is to use a virtual machine in the cloud (AWS) with more RAM and CPU.
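To make the memory limit concrete, here is a rough back-of-the-envelope sketch in Python for estimating whether a numeric table fits in RAM. It assumes every value is stored as an 8-byte float and ignores any overhead from the data structure itself. The function names and the dataset dimensions are illustrative assumptions, not part of the original tutorial.

```python
BYTES_PER_FLOAT64 = 8  # assumption: every cell is stored as a 64-bit float


def estimate_table_bytes(n_rows, n_cols, bytes_per_value=BYTES_PER_FLOAT64):
    """Rough in-memory size of a dense numeric table, ignoring overhead."""
    return n_rows * n_cols * bytes_per_value


def fits_in_ram(n_rows, n_cols, ram_gib):
    """Return True if the estimated table size is below the given RAM in GiB."""
    ram_bytes = ram_gib * 1024 ** 3
    return estimate_table_bytes(n_rows, n_cols) < ram_bytes


if __name__ == "__main__":
    # Hypothetical dataset: 50 million rows and 40 numeric columns.
    size_gib = estimate_table_bytes(50_000_000, 40) / 1024 ** 3
    print(f"Estimated size: {size_gib:.1f} GiB")
    print("Fits in 8 GiB of RAM:", fits_in_ram(50_000_000, 40, 8))
    print("Fits in 32 GiB of RAM:", fits_in_ram(50_000_000, 40, 32))
```

An estimate like this is only a lower bound, since loading and transforming data usually needs extra working memory, but it can help decide how much RAM an instance should have.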

Virtual machines on AWS EC2, also called instances, have many advantages. They are highly scalable (one can choose instances with more RAM, CPU, etc.), they are easy to start and stop (outside the free tier, customers pay for what they use), and they allow for the selection of different platforms (operating systems). An important point to emphasize is that although this tutorial covers how to launch a Windows-based virtual machine, there are many different types of virtual machines for many different purposes.

With that, let's get started.

Create an AWS Account and Sign in to AWS

1. On the Amazon Web Services site (here's the link), click on "Sign In to the Console". Sign in if you have an account. If you don't, you will need to create one.

create aws account

2. In the AWS Management Console, click on EC2 to open the EC2 Dashboard.

ec2 dashboard

Create an Instance

3. On the Amazon EC2 console, click on "Launch Instance".

launch instance

4. Click on the "Select" button in the row with Microsoft Windows Server 2016 Base. Please note that this will create a Windows-based instance instead of a typical Linux-based instance. This affects how you will connect to the instance.

choose ami

5. Make sure t2.micro (the free tier eligible instance type) is selected.

choose instance type

Then click on "Review and Launch".

choose instance type 2

6. Click on "Launch".

review instance launch

7. Select "Create a new key pair". In the box below ("Key pair name"), fill in a key pair name. I named my key DataCampTutorial, but you can name it whatever you like. Click on "Download Key Pair". This will download the key. Keep it somewhere safe.

download key pair

Next, click on "Launch Instances".

launch instances

8. The instance is now launched. Go back to the Amazon EC2 console. I recommend clicking on what is enclosed in the red rectangle, as it will bring you back to the console.

amazon ec2 console

9. Wait until "Instance State" shows running before you proceed to the next step. This can take a few minutes.

instance state
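The console steps above can also be approximated in code. The following is a minimal sketch using the boto3 EC2 client's `run_instances` call and its `instance_running` waiter. The helper names `build_launch_request` and `launch_and_wait`, the region, and the AMI ID `ami-EXAMPLE` are placeholders and assumptions, not values from this tutorial. It also assumes the key pair already exists in your account and that boto3 is installed with AWS credentials configured.

```python
def build_launch_request(ami_id, key_name, instance_type="t2.micro"):
    """Keyword arguments for run_instances that mirror steps 4 to 7."""
    return {
        "ImageId": ami_id,              # the Windows Server AMI chosen in step 4
        "InstanceType": instance_type,  # t2.micro, as in step 5
        "KeyName": key_name,            # the key pair from step 7
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch_and_wait(ec2_client, ami_id, key_name):
    """Launch one instance and block until its state is running (step 9)."""
    response = ec2_client.run_instances(**build_launch_request(ami_id, key_name))
    instance_id = response["Instances"][0]["InstanceId"]
    ec2_client.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    return instance_id


# Example usage (not run here; the region and AMI ID are placeholders):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   instance_id = launch_and_wait(ec2, "ami-EXAMPLE", "DataCampTutorial")
```

Note that this sketch leaves networking at the account defaults. Depending on your default security group, you may still need to allow inbound RDP traffic before you can connect.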

Connect to your Instance

10. Click on "Connect".

connect to instance

11. Click on "Download Remote Desktop File". Save the remote desktop (.rdp) file somewhere safe.

download remote desktop file

12. Click on "Get Password". Keep in mind that you have to wait at least 4 minutes after you launch an instance before trying to retrieve your password.

get password

13. Choose the .pem file you downloaded in step 7 and then click "Decrypt Password".

decrypt password

14. After you decrypt your password, save it somewhere safe. You will need it to log into your instance.

instance password

15. Open your .rdp file and click on "Continue". If your local computer is a Mac, you will need to download "Microsoft Remote Desktop" from the App Store to be able to open the .rdp file.

download microsoft remote desktop

16. Enter the password you got in step 14.

enter password

After you enter your password, you should see a screen like this

screen
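For readers who prefer scripting, the pieces used in the connection steps above can be sketched with the boto3 EC2 client. `describe_instances`, `get_password_data`, and the `password_data_available` waiter are boto3 calls; the helper names below are hypothetical. The password comes back encrypted, so this sketch stops at returning the encrypted bytes. Decrypting them requires the private key from step 7 and an RSA library, which is left out here. The generated .rdp text is a minimal assumption of what such a file needs; the file you download from the console may contain more settings.

```python
import base64


def get_public_dns(ec2_client, instance_id):
    """Look up the public DNS name that the remote desktop client connects to."""
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    return response["Reservations"][0]["Instances"][0]["PublicDnsName"]


def get_encrypted_password(ec2_client, instance_id):
    """Wait until the Windows password exists, then return it still encrypted.

    The waiter polls until password data is available, which covers the
    waiting period mentioned in step 12.
    """
    ec2_client.get_waiter("password_data_available").wait(InstanceId=instance_id)
    response = ec2_client.get_password_data(InstanceId=instance_id)
    return base64.b64decode(response["PasswordData"])


def rdp_file_text(public_dns, username="Administrator"):
    """Build the text of a minimal .rdp file pointing at the instance."""
    lines = [
        f"full address:s:{public_dns}",
        f"username:s:{username}",
    ]
    return "\n".join(lines) + "\n"


# Example usage (not run here; the instance ID is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2")
#   dns = get_public_dns(ec2, "i-EXAMPLE")
#   with open("instance.rdp", "w") as handle:
#       handle.write(rdp_file_text(dns))
```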

Download Firefox

To install R and/or Python, it really helps to have a browser. The instance comes with Internet Explorer preinstalled, but its Enhanced Security Configuration is enabled, which can be difficult to work with. Download Firefox as an alternative browser to avoid the enhanced security prompts from Internet Explorer.

  1. Type the following into Internet Explorer: https://www.mozilla.org/firefox/new/?scene=2

download firefox

  2. Click on "Add" when you see the popup below.

add

Click on "Add" again.

add

  3. When you get to the Firefox page, you may have to click on "Add" a couple more times (as in steps 1 and 2) until the Firefox download starts. If the download doesn't start automatically, click on "click here".

download

Now that Firefox is installed, be sure to use it as your browser. This is a lot simpler than continuously dealing with security prompts from Internet Explorer.

Install R and Python

Now that Firefox is installed, you can install R and Python as you would on a normal Windows machine. If you need help installing, refer to the installation guides for each tool.
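Once Python is installed, a quick way to check which tools are reachable from the command line is a short script like the sketch below. It only uses `shutil.which` from the standard library. The function name `find_tools` and the list of executable names are assumptions. Keep in mind that an installer may not add its program to PATH, so "not found" here does not necessarily mean the tool is missing.

```python
import shutil


def find_tools(names=("R", "Rscript", "python")):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```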

Stop or Terminate an Instance (Important)

When you have finished using an instance, it is a good idea to stop or terminate it. To do this, go to the Amazon EC2 console and click on "Actions", then "Instance State". You will have the option of either stopping or terminating the instance.

If you plan on using the instance again, stop the instance. If you don't plan on using the instance again, terminate the instance.

Although the instance in this tutorial was in the "free tier", I recommend terminating it so you don't forget about it.

Stop or Terminate an Instance
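The same choice between stopping and terminating can be expressed as a small sketch with the boto3 EC2 client, using its `stop_instances` and `terminate_instances` calls. The helper name `shut_down` and its `keep_for_later` flag are hypothetical, and the instance ID in the usage comment is a placeholder.

```python
def shut_down(ec2_client, instance_id, keep_for_later=True):
    """Stop the instance if it will be reused, otherwise terminate it.

    Returns the state name reported by EC2 right after the request.
    """
    if keep_for_later:
        response = ec2_client.stop_instances(InstanceIds=[instance_id])
        change = response["StoppingInstances"][0]
    else:
        response = ec2_client.terminate_instances(InstanceIds=[instance_id])
        change = response["TerminatingInstances"][0]
    return change["CurrentState"]["Name"]


# Example usage (not run here; requires boto3 and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   print(shut_down(ec2, "i-EXAMPLE", keep_for_later=False))
```

Terminating is permanent, so the flag defaults to stopping; pass `keep_for_later=False` only when you no longer need the instance.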

Conclusion

This tutorial provided a quick guide to launching and connecting to EC2 instances, as well as how you would go about setting up a basic data science environment. If you would like to continue your EC2 learning, I suggest you check out the tutorial "Deep Learning with Jupyter Notebooks in the Cloud", which covers how to set up a Linux-based EC2 GPU instance for deep learning applications. If you have any questions or thoughts on the tutorial, feel free to reach out in the comments below or through Twitter.
