
Python Dictionaries: The Definitive Guide

Learn all about the Python Dictionary and its potential. You will also learn how to build a word frequency count using a dictionary.

If you are just getting started in Python and would like to learn more, take DataCamp's Introduction to Data Science in Python course.

A dictionary is a built-in, mutable Python data structure. It is similar in spirit to lists, sets, and tuples in that it is a container for other objects. However, it is indexed by keys rather than by a sequence of numbers, and can be understood as an associative array. On an abstract level, it consists of keys, each with an associated value. In Python, the dictionary is implemented as a hash table.

You might wonder what keys are.

Well, as shown in the figure below, keys are immutable (that is, unchangeable) values, most commonly strings or numbers. More precisely, a key must be hashable, so a mutable data type such as a list cannot be used as a key. Keys are unique within a dictionary: if the same key is used more than once, the later entry overwrites the earlier value.

A key connects to its value, creating a map-like structure. If you remove the keys from the picture for a second, all you are left with is a plain collection of values. Dictionaries, therefore, hold a key: value pair at each position.

A dictionary is written as a pair of curly braces {} enclosing key: value pairs separated by commas.

Let's look at the syntax of a dictionary:

dictionary = {"key_1": "value_1", "key_2": "value_2", "key_3": "value_3"}
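As a small sketch of the syntax above, the same dictionary can also be built with the built-in dict() constructor. The variable names here are illustrative and not part of the original tutorial.

```python
# Three equivalent ways to build the same dictionary (names are illustrative).
literal = {"key_1": "value_1", "key_2": "value_2"}
from_kwargs = dict(key_1="value_1", key_2="value_2")              # keyword arguments
from_pairs = dict([("key_1", "value_1"), ("key_2", "value_2")])   # list of (key, value) tuples

print(literal == from_kwargs == from_pairs)  # True
print(len(literal))                          # 2
```

The curly-brace literal is usually the most readable choice; the constructor forms are handy when the pairs come from another data source.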

Unique Keys

Now that you know that keys in a dictionary have to be unique, let's see what that means with the help of an example.

dictionary_unique = {"a": "alpha", "o": "omega", "g": "gamma"}

Let's print this out.

print(dictionary_unique)
{'a': 'alpha', 'o': 'omega', 'g': 'gamma'}

Great, so far everything looks fine. You were able to print your first dictionary. Now, let's repeat the key g with a new value and see what happens.

dictionary_unique = {"a": "alpha", "o": "omega", "g": "gamma", "g": "beta"}
print(dictionary_unique)
{'a': 'alpha', 'o': 'omega', 'g': 'beta'}

As expected, the previous value of key g, gamma, was overwritten by the new value, beta.
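The same overwrite rule applies when you assign to a key after the dictionary has been created. The following is a minimal sketch using an illustrative copy of the example dictionary (the name greek is an assumption, not from the tutorial).

```python
# Assigning to an existing key replaces its value; a new key adds an entry.
greek = {"a": "alpha", "o": "omega", "g": "gamma"}  # illustrative copy of the example
greek["g"] = "beta"    # existing key: the value is overwritten
greek["d"] = "delta"   # new key: an entry is added

print(greek)       # {'a': 'alpha', 'o': 'omega', 'g': 'beta', 'd': 'delta'}
print(len(greek))  # 4
```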

Immutable Keys

Now let's see what happens when you try to define the key as a mutable data type.

dictionary_immutable = {["a","b","c"]: "alpha", "o": "omega", "g": "gamma", "g": "beta"}
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-16-78a55d50cd65> in <module>
----> 1 dictionary_immutable = {["a","b","c"]: "alpha", "o": "omega", "g": "gamma", "g": "beta"}


TypeError: unhashable type: 'list'

From the above output, you can observe that defining the first key of the dictionary as a list results in a TypeError, since dictionary keys must be hashable and a list, being mutable, is not.

However, there is a workaround: replace the list with a tuple, since a tuple is an immutable data type.

dictionary_immutable = {("a","b","c"): "alpha", "o": "omega", "g": "gamma", "g": "beta"}
dictionary_immutable
{('a', 'b', 'c'): 'alpha', 'o': 'omega', 'g': 'beta'}
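To check in advance whether an object can serve as a key, you can try calling the built-in hash() on it. The helper below, is_hashable, is a hypothetical name introduced only for this sketch. Note that a tuple is only hashable if everything inside it is hashable too.

```python
def is_hashable(obj):
    """Return True if obj can be used as a dictionary key (hypothetical helper)."""
    try:
        hash(obj)          # raises TypeError for unhashable objects such as lists
    except TypeError:
        return False
    return True

print(is_hashable("a"))              # True
print(is_hashable(("a", "b", "c")))  # True
print(is_hashable(["a", "b", "c"]))  # False
print(is_hashable(("a", ["b"])))     # False: a tuple holding a list is not hashable
```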

Accessing Keys and Values

Now that you know how to create dictionaries, let's learn to access their keys and values. To access the key: value pairs, use the .items() method, which returns a dict_items view of (key, value) tuples.

dictionary_unique.items()
dict_items([('a', 'alpha'), ('o', 'omega'), ('g', 'beta')])
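The object returned by .items() is a view onto the dictionary, so it reflects later changes to it. A short sketch, using an illustrative dictionary named greek:

```python
greek = {"a": "alpha", "o": "omega"}  # illustrative dictionary
pairs = greek.items()                 # a dict_items view, not a list
greek["g"] = "gamma"                  # modify the dictionary afterwards

print(len(pairs))                     # 3: the view reflects the new entry
snapshot = list(pairs)                # copy into a real list when a snapshot is needed
print(snapshot[0])                    # ('a', 'alpha')
```

Wrapping the view in list() is the usual way to get an indexable, fixed copy of the pairs.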

To access keys and values separately, you could use a for loop over the dictionary's items or the .keys() and .values() methods.

for key, value in dictionary_unique.items():  #accessing keys
    print(key,end=',')
a,o,g,
for key, value in dictionary_unique.items():  #accessing values
    print(value,end=',')
alpha,omega,beta,
dictionary_unique.keys() #accessing keys without for loop
dict_keys(['a', 'o', 'g'])
dictionary_unique.values()  #accessing values without for loop
dict_values(['alpha', 'omega', 'beta'])
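Iterating over a dictionary directly yields its keys, and the in operator tests keys rather than values. A minimal sketch with an illustrative copy of the dictionary:

```python
greek = {"a": "alpha", "o": "omega", "g": "beta"}  # illustrative copy

keys_direct = [key for key in greek]   # iterating a dict yields its keys
print(keys_direct)                     # ['a', 'o', 'g']
print("a" in greek)                    # True: membership tests check keys
print("alpha" in greek)                # False: values are not searched
print("alpha" in greek.values())       # True
```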

You could even access a single value by placing its key in square brackets after the dictionary name.

dictionary_unique['a']
'alpha'
dictionary_unique['g']
'beta'
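Square-bracket access raises a KeyError when the key is missing. If that is a possibility, the .get() method returns None or a default of your choice instead. A short sketch with an illustrative dictionary:

```python
greek = {"a": "alpha", "g": "beta"}  # illustrative dictionary

print(greek.get("a"))             # alpha
print(greek.get("z"))             # None: no exception for a missing key
print(greek.get("z", "unknown"))  # unknown

try:
    greek["z"]                    # square brackets raise for a missing key
except KeyError as err:
    print("missing key:", err)    # missing key: 'z'
```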

Nested Dictionary

Creating a nested dictionary is quite simple. In a nested dictionary, you place a dictionary inside another dictionary; to put it simply, a dictionary is used as the value for a key of the outer dictionary.

Here you will create a datacamp dictionary inside the nested dictionary dictionary_nested.

dictionary_nested = {"datacamp":{"Deep Learning": "Python", "Machine Learning": "Pandas"},"linkedin":"jobs","nvidia":"hardware"}
dictionary_nested
{'datacamp': {'Deep Learning': 'Python', 'Machine Learning': 'Pandas'},
 'linkedin': 'jobs',
 'nvidia': 'hardware'}
dictionary_nested['datacamp']['Deep Learning']
'Python'
dictionary_nested['datacamp']['Machine Learning']
'Pandas'
dictionary_nested['linkedin']
'jobs'
dictionary_nested['nvidia']
'hardware'
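Nested values can be added or read through the outer key in the same way. The sketch below uses a smaller, illustrative dictionary named nested; the key "coursera" is deliberately absent to show a safe lookup.

```python
nested = {"datacamp": {"Deep Learning": "Python"}, "linkedin": "jobs"}  # illustrative

# Add a key to the inner dictionary through the outer key.
nested["datacamp"]["Machine Learning"] = "Pandas"

# Chain .get() with an empty-dict default to avoid a KeyError on a missing outer key.
missing = nested.get("coursera", {}).get("Deep Learning")

print(nested["datacamp"])  # {'Deep Learning': 'Python', 'Machine Learning': 'Pandas'}
print(missing)             # None
```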

Dictionary Comprehension

Dictionary comprehensions can be used to create dictionaries from arbitrary key and value expressions. It is a simple and concise way of creating dictionaries and is often faster than the usual for loop implementations.
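As a small illustration of the syntax, a comprehension consists of a key expression, a value expression, a loop, and an optional filter. The data below is made up for this sketch.

```python
words = ["alpha", "omega", "gamma"]  # illustrative data

lengths = {word: len(word) for word in words}           # key and value expressions
even_cubes = {i: i**3 for i in range(6) if i % 2 == 0}  # optional filter clause

print(lengths)     # {'alpha': 5, 'omega': 5, 'gamma': 5}
print(even_cubes)  # {0: 0, 2: 8, 4: 64}
```

The timing comparison that follows uses the same syntax with a larger range.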

import time
t1 = time.time()
dict_comprehension = {i: i**3 for i in range(200000)}
print(time.time() - t1)
0.0897526741027832

Let's print the first ten key: value pairs from the dict_comprehension dictionary. To achieve this, you will import islice from the itertools standard-library module and pass it the number of key: value pairs you want to extract, here 10.

from itertools import islice

comp_10 = list(islice(dict_comprehension.items(),10))
print(comp_10)
[(0, 0), (1, 1), (2, 8), (3, 27), (4, 64), (5, 125), (6, 216), (7, 343), (8, 512), (9, 729)]
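
For comparison, here is the same dictionary built with a regular for loop:

import time
t1 = time.time()
dict_comprehension = dict()
for i in range(200000):
    dict_comprehension[i] = i**3  # same keys and values as the comprehension above
print(time.time() - t1)
0.10853934288024902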

As you can see from the above two implementations, the dictionary comprehension wins by a small margin in terms of run time. Exact timings vary between runs and machines, but the absolute difference typically grows as you increase the range parameter.
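A single time.time() measurement can be noisy. One possible way to get a steadier comparison is the standard-library timeit module, taking the best of several runs. The function names with_comprehension and with_loop are hypothetical, introduced only for this sketch, and the resulting numbers will differ from machine to machine.

```python
import timeit

def with_comprehension(n):
    """Build {i: i**3} with a dictionary comprehension."""
    return {i: i**3 for i in range(n)}

def with_loop(n):
    """Build the same dictionary with an explicit for loop."""
    result = {}
    for i in range(n):
        result[i] = i**3
    return result

n = 200000
# Take the best of several runs to reduce noise; numbers vary by machine.
t_comp = min(timeit.repeat(lambda: with_comprehension(n), number=1, repeat=5))
t_loop = min(timeit.repeat(lambda: with_loop(n), number=1, repeat=5))
print(f"comprehension: {t_comp:.4f}s, loop: {t_loop:.4f}s")
```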

Word Frequency

From a collection of written texts, also known as a corpus (in this case, a single string of text), let's create a word frequency count with the help of a dictionary.

corpus = 'learn all about the Python Dictionary and its potential. \
            You would also learn to create word frequency using the Dictionary'
word_freq = dict()
corpus_word = corpus.split()  # split on whitespace into a list of words
for word in corpus_word:
    if word not in word_freq:
        word_freq[word] = 1   # first occurrence of this word
    else:
        word_freq[word] += 1  # repeated occurrence
word_freq
{'learn': 2,
 'all': 1,
 'about': 1,
 'the': 2,
 'Python': 1,
 'Dictionary': 2,
 'and': 1,
 'its': 1,
 'potential.': 1,
 'You': 1,
 'would': 1,
 'also': 1,
 'to': 1,
 'create': 1,
 'word': 1,
 'frequency': 1,
 'using': 1}

Great! So as you can observe from the above output, you were able to get a word count or word frequency from a string of text with the help of a Python Dictionary.
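As an alternative sketch, the standard library's collections.Counter, a dictionary subclass, performs the same counting in one step. The version below also lowercases each word and strips surrounding punctuation, which is an extra normalization step not in the original example, so 'potential.' is counted as 'potential'.

```python
from collections import Counter
import string

corpus = ("learn all about the Python Dictionary and its potential. "
          "You would also learn to create word frequency using the Dictionary")

# Lowercase each word and strip surrounding punctuation before counting.
words = [w.strip(string.punctuation).lower() for w in corpus.split()]
word_freq = Counter(words)

print(word_freq["learn"])       # 2
print(word_freq["potential"])   # 1
print(word_freq["dictionary"])  # 2
```

Because Counter is a dictionary, everything shown earlier (.items(), .keys(), square-bracket access) works on it as well.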

Conclusion

Congratulations on finishing the tutorial.

Please feel free to ask any questions related to this tutorial in the comments section below.
