
Python Global Interpreter Lock Tutorial

Learn what the Global Interpreter Lock is, how it works, and how it affects your multithreaded code.
Mar 2020  · 8 min read

If you are just getting started in Python and would like to learn more, take DataCamp's Introduction to Data Science in Python course.

A global interpreter lock (GIL) is a mechanism to apply a global lock on an interpreter. It is used in computer-language interpreters to synchronize and manage the execution of threads so that only one native thread (scheduled by the operating system) can execute at a time.

When a program runs multiple threads, two of them might try to modify the same memory at the same time and, as a result, overwrite each other's data. Hence, a mechanism is needed to prevent this.
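To make the idea of a lock concrete, here is a minimal sketch of the same protection applied at the application level with `threading.Lock`. The names `counter`, `counter_lock`, `add_many`, and `run_workers` are made up for this illustration. Note that this is an explicit lock in user code, not the GIL itself; the GIL plays a similar role for the interpreter's own internal state, and it does not guarantee that a statement like `counter += 1` is atomic.

```python
import threading

counter = 0                      # shared data touched by every thread
counter_lock = threading.Lock()  # guards every update to `counter`


def add_many(n):
    """Increment the shared counter n times, holding the lock for each update."""
    global counter
    for _ in range(n):
        with counter_lock:       # only one thread at a time may enter this block
            counter += 1


def run_workers(num_threads=4, n=10_000):
    """Reset the counter, run the worker threads to completion, return the total."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run_workers())
```

Because every update happens while the lock is held, the final total always equals `num_threads * n`, no matter how the threads are interleaved.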

Some popular interpreters that have a GIL are CPython and Ruby MRI. As most of you know, Python is an interpreted language with several implementations, such as CPython, Jython, and IronPython. Of these three, only CPython has a GIL, and it is also the most widely used implementation of Python. CPython is written in C and Python, and it is designed to work well with applications and libraries that have a lot of C code under the hood.

Even if your processor has multiple cores, the global interpreter lock allows only one thread to execute Python code at a time. This is because a thread must acquire the global interpreter lock before it can run. When it waits for a blocking I/O operation (reading/writing data from/to disk), it releases the lock so that other threads of that process can run. Some C extensions also release the lock while they perform a long CPU-bound operation (vector/matrix multiplication) outside the interpreter. While a thread holds the lock, however, no other thread of that process can execute Python code.

thread

Let's take a moment to understand the above diagram. As you can see, there is a factorial function and two threads, 1 and 2. Thread 1 is in the locked state (it holds the lock), while thread 2 is in a wait state. This means that only one of the threads can execute the function at a time. Now, let's assume that the factorial function takes 2 seconds to complete. In an ideal case, both threads would run in parallel and finish in 2 seconds. However, this is not the case in CPython: the two threads run serially, not in parallel.

Threads 1 and 2 calling the factorial function may take as much time as a single thread calling the function twice, and sometimes more because of the overhead of handing the lock back and forth. This also tells you that the memory manager governed by the interpreter is not thread-safe, which means that multiple threads cannot safely access the same shared data simultaneously.
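The factorial scenario can be tried out with a small timing sketch. The helper names (`factorial`, `run_sequential`, `run_threaded`) and the input size are chosen only for this illustration, and the exact timings depend on your machine and Python version. On a standard CPython build with the GIL, the threaded run is not expected to be noticeably faster than the sequential one.

```python
import threading
import time


def factorial(n):
    """Pure-Python, CPU-bound factorial (no I/O, so the GIL is never given up voluntarily)."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def run_sequential(n, calls=2):
    """Call factorial(n) `calls` times in one thread; return (results, seconds)."""
    start = time.perf_counter()
    results = [factorial(n) for _ in range(calls)]
    return results, time.perf_counter() - start


def run_threaded(n, calls=2):
    """Call factorial(n) once in each of `calls` threads; return (results, seconds)."""
    results = [None] * calls

    def worker(index):
        results[index] = factorial(n)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(calls)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


_, sequential_seconds = run_sequential(3000)
_, threaded_seconds = run_threaded(3000)
print(f"sequential: {sequential_seconds:.4f}s, threaded: {threaded_seconds:.4f}s")
```

Both variants compute the same values; only the way the work is scheduled differs, which is why comparing the two timings is a fair way to observe the effect of the lock.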

Hence, the GIL (Source: Understanding the Python GIL):

  • Limits threading operations.

  • Restricts parallel execution.

  • Ensures that only one thread runs in the interpreter at once.

  • Helps simplify various low-level details like memory management.

With the GIL, cooperative computing or coordinated multitasking is achieved instead of parallel computing.

gil
Source

As can be observed from the above diagram, there are three threads. Initially, thread 1 is running and has acquired the GIL. When it performs an I/O operation such as a read or write, thread 1 releases the GIL, which is then acquired by thread 2. This cycle keeps going, and the GIL keeps passing from one thread to another until the threads have completed the execution of the program. Remember that any thread that does not hold the lock and has not completed its execution stays in a waiting state.
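The hand-over of the GIL during I/O can be observed with a short sketch. Here `time.sleep` stands in for a blocking read or write (an assumption made to keep the example self-contained), and the names `fake_io` and `run_io_threads` are hypothetical. Since a sleeping thread releases the GIL, the waits of the three threads overlap instead of adding up.

```python
import threading
import time


def fake_io(delay, log, name):
    """Pretend to do blocking I/O; sleeping releases the GIL while the thread waits."""
    time.sleep(delay)
    log.append(name)  # record that this thread finished


def run_io_threads(num_threads=3, delay=0.2):
    """Run several I/O-like threads; return (sorted finished names, elapsed seconds)."""
    log = []
    threads = [
        threading.Thread(target=fake_io, args=(delay, log, f"thread-{i}"))
        for i in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(log), time.perf_counter() - start


finished, elapsed = run_io_threads()
print(finished, f"{elapsed:.2f}s")
```

If the threads had to wait one after another, the total would be close to `num_threads * delay`; because each waiting thread gives up the GIL, the elapsed time should stay close to a single `delay`.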

GIL in Python2 vs. Python3
gil

The Global Interpreter Lock in Python 2.7 works differently from the one in Python 3. In a purely CPU-bound operation, a thread never performs I/O, so it would keep running while the other threads sit in an idle or wait state, which is not what you want. To solve this issue, Python 2 uses a concept known as ticks. The interpreter performs a check to monitor the state of each thread, such as whether it is in a wait state, performing an I/O operation, or running. However, it does not monitor the threads at every instant; rather, it counts ticks and performs the check every 100 ticks.

A tick roughly corresponds to one byte-code instruction. After 100 byte-code instructions have been executed, the interpreter checks whether the other threads are running or waiting. It is important to remember that a tick is not related to time, since it is a byte-code instruction: one tick might take longer or shorter to run than another.
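To get a feel for what a byte-code instruction is, you can disassemble a small function with the standard `dis` module. This is only a sketch: it runs on Python 3, whose byte-code differs from that of Python 2.7, and the helper `count_instructions` is a made-up name that counts the instructions in the compiled function, not the number of instructions executed at run time.

```python
import dis


def factorial(n):
    """Small pure-Python function whose byte-code we want to inspect."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def count_instructions(func):
    """Return how many byte-code instructions the compiled function contains."""
    return sum(1 for _ in dis.get_instructions(func))


dis.dis(factorial)                    # print a human-readable listing of the byte-code
print(count_instructions(factorial))  # static instruction count for this function
```

Each line of the listing is one instruction. A single instruction such as the multiplication can take very different amounts of time depending on the size of the numbers involved, which is why ticks are not a measure of time.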

gil
Source

This periodic check of the threads after every 100 ticks is essential, especially in CPU-bound operations, since they do not perform any I/O. Note that in Python 2 you can modify the check interval with the sys.setcheckinterval() function of the built-in sys module.

gil
Source

Let's discuss some of the demerits of the above approach:

If you look at this diagram, you may notice that the threads do not run in parallel but in serial order, which is the cooperative computing governed by the global interpreter lock. The threads also have to wait for long durations. For example, thread T3 had to wait for both T1 and T2 to release the GIL. Hence, threads can starve for the lock and for compute time.

Another demerit is that the threads battle over which one will acquire the GIL. Initially, it might be the case that, based on priority, T1 gets the GIL, then T2, and so on. However, when T3 releases the GIL, it sends a notification to all other threads, and all of them then fight to acquire it. In principle, assigning a priority to each thread could overcome this, although Python's threading module does not provide a way to set thread priorities.

In Python 3 (since version 3.2), each thread is allocated a fixed time slice, which is 5 ms of execution time by default. This prevents threads from starving for resources, since the waiting time is the same for all threads. A waiting thread first waits to see whether thread 1 releases the GIL voluntarily because of an I/O or sleep operation. If it does not, thread 1 is forced to release the GIL after 5 ms. Hence, CPU time is shared far more evenly between the threads, which largely removes the battle over acquiring the GIL.
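In Python 3, this time slice is exposed through `sys.getswitchinterval()` and `sys.setswitchinterval()`, both of which work in seconds. The sketch below reads the current value and temporarily changes it; the helper name `with_switch_interval` is made up for this illustration, and the value 0.01 is an arbitrary choice.

```python
import sys


def with_switch_interval(seconds, func):
    """Run func() with a temporary switch interval, then restore the previous one."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        sys.setswitchinterval(previous)  # always put the old value back


print(sys.getswitchinterval())  # current interval in seconds
print(with_switch_interval(0.01, sys.getswitchinterval))  # interval seen inside the helper
print(sys.getswitchinterval())  # the original interval again
```

A larger interval means fewer forced hand-overs of the GIL (less switching overhead), while a smaller one lets waiting threads get the lock sooner. Restoring the previous value in `finally` keeps the change from leaking into the rest of the program.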

gil

However, it is important to note that the new implementation of the GIL tends to favor CPU-bound jobs over I/O-bound ones. When a thread has released the GIL and two threads are waiting, the GIL is more likely to be acquired by the thread that runs a CPU-bound job than by the other thread.

Conclusion

Congratulations on finishing the tutorial.

This tutorial covered an advanced topic, and its purpose was to give you a theoretical overview of how exactly the GIL works in Python. Since the focus was on theory, one good exercise would be to experiment on your own: create multiple threads and a function that performs an I/O-intensive task, and then analyze how the GIL is switched between the threads.

Please feel free to ask any questions related to this tutorial in the comments section below.
