
Lists: N-Sized Chunks

In this tutorial, you will work with lists and learn an efficient way to divide arbitrarily sized lists into chunks of a given size.
Sep 12, 2018  · 5 min read

Lists are built-in data structures in Python that store heterogeneous items and provide efficient access to them. Dividing a list into N-sized chunks is a common task when there is a limit on the number of items your program can handle in a single request.

Lists

Lists are data structures that can hold items of mixed types, such as integers, floats, and strings. Lists are mutable, which means you can change the contents of a list without changing its identity. They are written with square brackets [ ], and the items within them are separated by commas (,). Let's create a list of numbers to work with.

# Creating a list of 95 numbers
list_numbers = list(range(1, 96))
print(list_numbers)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95]

The range() function generates a sequence of numbers. In Python 3 it returns a lazy range object rather than a list, which is why the code above wraps it in list(). It takes the form range([start], stop[, step]), where start and step are optional. start is the first number in the sequence and defaults to 0 when omitted; stop is the upper bound, which is not included in the sequence; step is the difference between consecutive numbers and defaults to 1.
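To make the start, stop, and step parameters concrete, here is a small sketch. It assumes Python 3, where range() is lazy and needs list() to show its values; the variable names are illustrative only.

```python
# With only 'stop' given, 'start' defaults to 0 and 'stop' itself is excluded
default_start = list(range(5))
print(default_start)   # [0, 1, 2, 3, 4]

# With 'start' and 'stop': begins at 1, ends before 6
start_and_stop = list(range(1, 6))
print(start_and_stop)  # [1, 2, 3, 4, 5]

# With 'step': every 10th index of a 95-item list
offsets = list(range(0, 95, 10))
print(offsets)         # [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
```

The last call produces exactly the starting indexes that a chunking loop needs for chunks of 10 items.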

Now, let's say you need to break the list down into smaller lists of 10 elements each.

One way to do this is by defining a generator. A generator is an elegant way to define an iterator. What is an iterator, you ask? Put simply, an iterator is an object that knows how to compute and return the next item in the sequence you are iterating through. You can read more about iterators and generators in DataCamp's Python Iterator Tutorial.
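As a minimal illustration of that idea, the sketch below defines a hypothetical generator, count_up_to (not part of the chunking solution), and pulls items from it one at a time with the built-in next().

```python
def count_up_to(limit):
    """Hypothetical generator: yields 1, 2, ..., limit, one value per request."""
    current = 1
    while current <= limit:
        yield current  # execution pauses here until the next item is requested
        current += 1

counter = count_up_to(3)
first = next(counter)      # computes and returns the first item
second = next(counter)     # resumes where it left off
remaining = list(counter)  # consumes whatever is left
print(first, second, remaining)  # 1 2 [3]
```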

We define a generator function that does the work for us.

# Yields successive 'n' sized chunks from list 'list_name'
def create_chunks(list_name, n):
    for i in range(0, len(list_name), n):
        yield list_name[i:i + n]

# Call the 'create_chunks' function to divide the list further into sub-lists of 10 items each
print(list(create_chunks(list_numbers, 10)))

[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27, 28, 29, 30], [31, 32, 33, 34, 35, 36, 37, 38, 39, 40], [41, 42, 43, 44, 45, 46, 47, 48, 49, 50], [51, 52, 53, 54, 55, 56, 57, 58, 59, 60], [61, 62, 63, 64, 65, 66, 67, 68, 69, 70], [71, 72, 73, 74, 75, 76, 77, 78, 79, 80], [81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [91, 92, 93, 94, 95]]
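Because create_chunks is a generator, you can also loop over it directly instead of wrapping it in list(), so only one chunk is built at a time. The sketch below repeats the definitions from above so that it runs on its own; chunk_sizes is an illustrative name.

```python
def create_chunks(list_name, n):
    # Same generator as above, repeated so this snippet is self-contained
    for i in range(0, len(list_name), n):
        yield list_name[i:i + n]

list_numbers = list(range(1, 96))

# Each chunk is produced only when the loop asks for it
chunk_sizes = []
for chunk in create_chunks(list_numbers, 10):
    chunk_sizes.append(len(chunk))

print(chunk_sizes)  # [10, 10, 10, 10, 10, 10, 10, 10, 10, 5]
```

The last chunk holds the 5 leftover items because 95 is not a multiple of 10.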

Another way to do the same thing is to use a list comprehension. You can read more about it in DataCamp's Python List Comprehension Tutorial.

print([list_numbers[i:i + 10] for i in range(0, len(list_numbers), 10)])

[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27, 28, 29, 30], [31, 32, 33, 34, 35, 36, 37, 38, 39, 40], [41, 42, 43, 44, 45, 46, 47, 48, 49, 50], [51, 52, 53, 54, 55, 56, 57, 58, 59, 60], [61, 62, 63, 64, 65, 66, 67, 68, 69, 70], [71, 72, 73, 74, 75, 76, 77, 78, 79, 80], [81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [91, 92, 93, 94, 95]]
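A quick way to confirm that the two approaches agree is to compare their results directly. This sketch repeats the earlier definitions so it can run on its own; from_generator and from_comprehension are illustrative names.

```python
def create_chunks(list_name, n):
    # Generator version from earlier in the tutorial
    for i in range(0, len(list_name), n):
        yield list_name[i:i + n]

list_numbers = list(range(1, 96))

from_generator = list(create_chunks(list_numbers, 10))
from_comprehension = [list_numbers[i:i + 10] for i in range(0, len(list_numbers), 10)]

print(from_generator == from_comprehension)  # True
```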

As the problem gets more complicated, list comprehensions can become harder to understand and debug. In those cases, a clean function such as the generator above is often more useful and easier to keep track of.
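One example of such added complexity is input validation. The sketch below is a hypothetical variant, create_chunks_checked (not from the original solution), which rejects chunk sizes that are not positive integers. A check like this would be awkward to fit into a one-line comprehension.

```python
def create_chunks_checked(list_name, n):
    """Hypothetical variant of create_chunks that validates the chunk size."""
    # Because this is a generator, the check runs when iteration starts,
    # not at the moment the function is called.
    if not isinstance(n, int) or n <= 0:
        raise ValueError("n must be a positive integer")
    for i in range(0, len(list_name), n):
        yield list_name[i:i + n]

print(list(create_chunks_checked([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]

try:
    list(create_chunks_checked([1, 2, 3], 0))
except ValueError as error:
    print(error)  # n must be a positive integer
```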

In this tutorial, you have learned two ways to solve a rather frequent problem when dealing with lists. Check out DataCamp's Data Types for Data Science course.
