Python For Data Science - A Cheat Sheet For Beginners

October 12th, 2016 in Python

54% of the respondents to the latest O'Reilly Data Science Salary Survey indicated that they used Python as a data science tool. This is a small increase over the 2015 survey, in which 51% of respondents reported using Python.

Nobody can deny that Python has been on the rise in the data science industry and it certainly seems that it's here to stay.

Its growing popularity in the industry, the maturity of its data analysis packages, its gentle learning curve, and the fact that it is a fully fledged programming language are only a few of the reasons that make Python an exceptional tool for data science.

Although Python is a very readable language, you might still want some help.

That's enough reason for DataCamp to make a Python cheat sheet for data science, especially for beginners. It can serve as a quick reference if you are just beginning your data science journey, or as a guide that makes it easier to learn and use Python.

This cheat sheet is free additional material that complements DataCamp's Intro to Python for Data Science course, where you learn by doing.

[Image: printable version of the Python For Data Science Cheat Sheet For Beginners]

This Python cheat sheet guides you through variables and data types, strings, and lists, and ends with NumPy, the fundamental package for scientific computing with Python.

Install Python

Download Anaconda

Libraries

Import libraries

import numpy
import numpy as np

Selective import

from math import pi
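The three import styles above differ only in how you reach a name afterwards. Here is a minimal sketch using the standard-library math module instead of numpy, so that it runs without extra installs. The names radius and area_* are illustrative and not part of the cheat sheet.

```python
import math                 # whole module: names are reached as math.<name>
import math as m            # same module under a shorter alias
from math import pi, sqrt   # selective import: only these names are bound

radius = 2
area_qualified = math.pi * radius ** 2   # module-qualified access
area_alias = m.pi * radius ** 2          # access through the alias
area_selective = pi * radius ** 2        # bare name from the selective import

print(area_qualified == area_alias == area_selective)  # True
print(sqrt(16))                                        # 4.0
```

The alias form (import numpy as np) is the usual choice for NumPy because it keeps calls short while still showing where each function comes from.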

Asking for Help

>>> help(str)

Variables and Data Types

Variable Assignment

>>> x=5
>>> x
5

Calculations With Variables

>>> x+2           # Sum of two variables
7
>>> x-2           # Subtraction of two variables
3
>>> x*2           # Multiplication of two variables
10
>>> x**2          # Exponentiation of a variable
25
>>> x%2           # Remainder of a variable
1
>>> x/float(2)    # Division of a variable
2.5
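The cheat sheet writes x/float(2) because in Python 2 dividing two integers with / drops the fractional part. A short sketch, assuming Python 3, of how the division operators relate. The variable names other than x are illustrative.

```python
x = 5

true_division = x / float(2)    # 2.5 on both Python 2 and Python 3
plain_division = x / 2          # 2.5 on Python 3 (Python 2 would give 2)
floor_division = x // 2         # 2: floor division on both versions
remainder = x % 2               # 1
quotient, rest = divmod(x, 2)   # (2, 1): floor division and remainder together

print(true_division, plain_division, floor_division, remainder)
```

On Python 3 the float() call is harmless but no longer needed. Use // when you want the integer result on purpose.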

Types and Type Conversion

str()      '5', '3.45', 'True'    Variables to strings
int()      5, 3, 1                Variables to integers
float()    5.0, 1.0               Variables to floats
bool()     True, True, True       Variables to booleans
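The table lists only the results of each conversion. The sketch below pairs them with plausible inputs. The inputs are an assumption, chosen so that the outputs match the values in the table.

```python
as_strings = [str(5), str(3.45), str(True)]         # ['5', '3.45', 'True']
as_integers = [int('5'), int(3.45), int(True)]      # [5, 3, 1]  (int() truncates 3.45)
as_floats = [float(5), float(True)]                 # [5.0, 1.0]
as_booleans = [bool(5), bool(3.45), bool('True')]   # [True, True, True]

# Zero and empty values are the ones that convert to False.
falsy = [bool(0), bool(''), bool([])]               # [False, False, False]

print(as_strings, as_integers, as_floats, as_booleans, falsy)
```

Note that bool('False') is also True, because any non-empty string is truthy.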

Strings

>>> my_string = 'thisStringIsAwesome'
>>> my_string
'thisStringIsAwesome'

String Operations

>>> my_string * 2
'thisStringIsAwesomethisStringIsAwesome'
>>> my_string + 'Innit'
'thisStringIsAwesomeInnit'
>>> 'm' in my_string
True

Selecting String Characters

(Index starts at 0)

>>> my_string[3]
>>> my_string[4:9]
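The two selections above are shown without their results. A small sketch of what they return, plus two extra selections (a negative index and an open-ended slice) added here for illustration:

```python
my_string = 'thisStringIsAwesome'

fourth_char = my_string[3]     # 's'      (indices 0..3 are 't', 'h', 'i', 's')
middle = my_string[4:9]        # 'Strin'  (start index included, stop index excluded)
last_char = my_string[-1]      # 'e'      (negative indices count from the end)
first_four = my_string[:4]     # 'this'   (an omitted start means index 0)

print(fourth_char, middle, last_char, first_four)
```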

String Methods

>>> my_string.upper()              # String to uppercase
>>> my_string.lower()              # String to lowercase
>>> my_string.count('w')           # Count string elements
>>> my_string.replace('e', 'i')    # Replace string elements
>>> my_string.strip()              # Strip whitespace from ends
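A short sketch of what the string methods above return for my_string. The padded string passed to strip() is a made-up example, since my_string itself has no surrounding whitespace.

```python
my_string = 'thisStringIsAwesome'

upper = my_string.upper()                # 'THISSTRINGISAWESOME'
lower = my_string.lower()                # 'thisstringisawesome'
w_count = my_string.count('w')           # 1
replaced = my_string.replace('e', 'i')   # 'thisStringIsAwisomi'
stripped = '  padded text \n'.strip()    # 'padded text'

# Strings are immutable: every method returned a new string.
print(my_string)                         # thisStringIsAwesome
```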

Lists

>>> a = 'is'
>>> b = 'nice'
>>> my_list = ['my', 'list', a, b]
>>> my_list2 = [[4,5,6,7], [3,4,5,6]]

Selecting List Elements

(Index starts at 0)

Subset

>>> my_list[1]     # Select item at index 1
>>> my_list[-3]    # Select third-to-last item

Slice

>>> my_list[1:3]   # Select items at index 1 and 2
>>> my_list[1:]    # Select items from index 1 onward
>>> my_list[:3]    # Select items before index 3
>>> my_list[:]     # Copy my_list

Subset Lists of Lists

>>> my_list2[1][0]     # my_list2[list][itemOfList]
>>> my_list2[1][:2]
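The list selections above can be checked with the following sketch, which rebuilds my_list and my_list2 as defined earlier. The name list_copy and the mutation of the copy are additions for illustration.

```python
a = 'is'
b = 'nice'
my_list = ['my', 'list', a, b]
my_list2 = [[4, 5, 6, 7], [3, 4, 5, 6]]

second = my_list[1]              # 'list'
third_last = my_list[-3]         # 'list' (the same element, counted from the end)
middle = my_list[1:3]            # ['list', 'is']
tail = my_list[1:]               # ['list', 'is', 'nice']
head = my_list[:3]               # ['my', 'list', 'is']

list_copy = my_list[:]           # a new list holding the same items
list_copy[0] = 'your'            # changing the copy leaves my_list untouched

nested_item = my_list2[1][0]     # 3       (second inner list, first item)
nested_slice = my_list2[1][:2]   # [3, 4]  (second inner list, first two items)

print(my_list, list_copy)
```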

List Operations

>>> my_list + my_list
['my', 'list', 'is', 'nice', 'my', 'list', 'is', 'nice']
>>> my_list * 2
['my', 'list', 'is', 'nice', 'my', 'list', 'is', 'nice']
>>> my_list2 > 4    # Python 2 only; Python 3 raises TypeError
True

List Methods

>>> my_list.index(a)         # Get the index of an item
>>> my_list.count(a)         # Count an item
>>> my_list.append('!')      # Append an item at a time
>>> my_list.remove('!')      # Remove an item by value
>>> del(my_list[0:1])        # Remove items by slice
>>> my_list.reverse()        # Reverse the list in place
>>> my_list.extend('!')      # Append each item of an iterable
>>> my_list.pop(-1)          # Remove and return the last item
>>> my_list.insert(0,'!')    # Insert an item at index 0
>>> my_list.sort()           # Sort the list in place
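Run in order, the list methods above change my_list step by step. The sketch below traces that sequence, starting from the four-item list defined earlier. The letters list at the end is a made-up example showing how append and extend differ.

```python
my_list = ['my', 'list', 'is', 'nice']

position = my_list.index('is')      # 2
occurrences = my_list.count('is')   # 1

my_list.append('!')        # ['my', 'list', 'is', 'nice', '!']
my_list.remove('!')        # ['my', 'list', 'is', 'nice']
del my_list[0:1]           # ['list', 'is', 'nice']
my_list.reverse()          # ['nice', 'is', 'list']
my_list.extend('!')        # ['nice', 'is', 'list', '!']
popped = my_list.pop(-1)   # popped is '!'; list is ['nice', 'is', 'list']
my_list.insert(0, '!')     # ['!', 'nice', 'is', 'list']
my_list.sort()             # ['!', 'is', 'list', 'nice']

# append adds its argument as one item; extend adds each item of an iterable.
letters = ['a']
letters.append('bc')       # ['a', 'bc']
letters.extend('bc')       # ['a', 'bc', 'b', 'c']

# In-place methods return None, so do not reassign their result.
result = letters.reverse() # result is None; letters is ['c', 'b', 'bc', 'a']

print(my_list, letters, result)
```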

NumPy Arrays

>>> my_list = [1, 2, 3, 4]
>>> my_array = np.array(my_list)
>>> my_2darray = np.array([[1,2,3],[4,5,6]])

Selecting NumPy Array Elements

(Index starts at 0)

Subset

>>> my_array[1]    # Select item at index 1
2

Slice

>>> my_array[0:2]    # Select items at index 0 and 1
array([1, 2])

Subset 2D NumPy arrays

>>> my_2darray[:,0]    # my_2darray[rows, columns]
array([1, 4])
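The [rows, columns] pattern extends beyond selecting a single column. A minimal sketch with the same my_2darray; the other selections are additions for illustration:

```python
import numpy as np

my_2darray = np.array([[1, 2, 3], [4, 5, 6]])

first_column = my_2darray[:, 0]    # array([1, 4])      every row, column 0
first_row = my_2darray[0, :]       # array([1, 2, 3])   row 0, every column
single_item = my_2darray[1, 2]     # 6                  row 1, column 2
sub_block = my_2darray[0:2, 1:]    # array([[2, 3], [5, 6]])

print(first_column, first_row, single_item)
print(sub_block)
```

Unlike nested lists, which need two bracket pairs (my_list2[1][0]), a NumPy array takes both indices inside one pair of brackets.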

NumPy Array Operations

>>> my_array > 3
array([False, False, False, True], dtype=bool)
>>> my_array * 2
array([2, 4, 6, 8])
>>> my_array + np.array([5, 6, 7, 8])
array([ 6,  8, 10, 12])
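The same operators behave differently on lists and on NumPy arrays: lists repeat and concatenate, while arrays compute element by element. A short side-by-side sketch; the boolean-mask selection at the end is an addition for illustration.

```python
import numpy as np

my_list = [1, 2, 3, 4]
my_array = np.array(my_list)

repeated = my_list * 2                       # [1, 2, 3, 4, 1, 2, 3, 4]
doubled = my_array * 2                       # array([2, 4, 6, 8])

joined = my_list + [5, 6, 7, 8]              # [1, 2, 3, 4, 5, 6, 7, 8]
summed = my_array + np.array([5, 6, 7, 8])   # array([ 6,  8, 10, 12])

mask = my_array > 3                          # array([False, False, False,  True])
selected = my_array[mask]                    # array([4]): items where the mask is True

print(repeated, doubled, joined, summed, mask, selected)
```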

NumPy Array Functions

>>> my_array.shape                      # Get the dimensions of the array
>>> np.append(my_array, other_array)    # Append items to an array
>>> np.insert(my_array, 1, 5)           # Insert items in an array
>>> np.delete(my_array,[1])             # Delete items in an array
>>> np.mean(my_array)                   # Mean of the array
>>> np.median(my_array)                 # Median of the array
>>> np.corrcoef(my_array)               # Correlation coefficient
>>> np.std(my_array)                    # Standard deviation
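A sketch of the functions above applied to my_array. The cheat sheet never defines other_array, so the values used for it here are hypothetical. Note that np.append, np.insert and np.delete return new arrays and leave the original unchanged.

```python
import numpy as np

my_array = np.array([1, 2, 3, 4])
other_array = np.array([2, 4, 6, 8])          # hypothetical values, not from the cheat sheet

shape = my_array.shape                        # (4,)
appended = np.append(my_array, other_array)   # array([1, 2, 3, 4, 2, 4, 6, 8])
inserted = np.insert(my_array, 1, 5)          # array([1, 5, 2, 3, 4])
deleted = np.delete(my_array, [1])            # array([1, 3, 4])

mean = np.mean(my_array)                      # 2.5
median = np.median(my_array)                  # 2.5
std = np.std(my_array)                        # about 1.118 (population standard deviation)
corr = np.corrcoef(my_array, other_array)     # 2x2 matrix; off-diagonal entries are about 1.0

# my_array itself is unchanged by the calls above.
print(my_array)                               # [1 2 3 4]
```

np.corrcoef is a module-level function rather than an array method, and it is most informative when given two or more variables, as in this sketch.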

If you're interested in more cheat sheets, check out our Bokeh cheat sheet for data visualization in Python and our Pandas cheat sheet for data manipulation in Python. 

Do you want to learn more? Complete the Intro to Python for Data Science course for free now! 

Comments

contact-b9287c39-402b-48b5-81b1-f8475f02e51b
Hi Karlijn, thank you for the great compilation, but there seems to be a typo in the "Calculations With Variables" section: 5 % 2 = 1 should be labeled 'remainder' instead of 'division'.
10/17/16 1:31 PM |
karlijn
Thank you very much for mentioning this! I adjusted it :)
10/17/16 8:57 PM |
gaiusjaugustus
This is awesome, thanks. I've seen some on pandas, but no easily printable ones yet. Any suggestions?
10/12/16 9:09 PM |
karlijn
Hi! There are indeed some on pandas; I did see that Quandl has made a NumPy / SciPy / Pandas Cheat Sheet that seems (easily) printable. Maybe this could be of help?
10/17/16 9:10 PM |