
How to Convert Bytes to String in Python

To convert bytes to strings in Python, we can use the .decode() method, specifying the appropriate encoding.
Jun 12, 2024  · 8 min read

One of the lesser-known built-in sequences in Python is bytes, which is an immutable sequence of integers. Each integer represents a byte of data. Data is often transmitted across networks and programs as bytes. Bytes aren't human-readable, and we often need to convert them to strings in our Python programs.

This tutorial explores the techniques of converting bytes to strings in Python. If you're interested in the reverse operation, check out my tutorial on how to convert strings to bytes in Python.

Let's start with a short answer for those of you in a hurry.

Short Answer: How to Convert Bytes to String in Python

The main tool to convert bytes to strings in Python is the .decode() method:

data = b"\xc3\xa9clair"
text = data.decode(encoding="utf-8")
print(text)
éclair

The .decode() method returns a string decoded from the bytes object. In this example, UTF-8 decoding is used. UTF-8 is the default encoding, so the encoding argument isn't required in the example above, but we can also use other encodings.
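As a minimal sketch of decoding with a different encoding, the bytes below are assumed to hold the same word encoded as Latin-1. The byte values are chosen for illustration and don't come from the example above. In Latin-1, é is the single byte 0xE9, so the encoding name has to be passed explicitly:

```python
# Assumption: these bytes were produced with the Latin-1 encoding
data = b"\xe9clair"

# Pass the encoding that matches how the bytes were created
text = data.decode("latin-1")
print(text)
```

Decoding these particular bytes with the default UTF-8 would raise a UnicodeDecodeError, because 0xE9 followed by an ASCII letter isn't a valid UTF-8 sequence.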

Understanding Bytes and Strings in Python

Python has a built-in bytes data structure, which is an immutable sequence of integers in the range 0 to 255. An integer within this range of 256 numbers can be represented using eight bits of data, which is equal to one byte. Therefore, each element in a bytes object is an integer representing one byte.

Let's use the bytes() constructor to create a bytes object from a list:

data = bytes([68, 97, 116, 97, 67, 97, 109, 112])
print(type(data))
<class 'bytes'>

This sequence of integers is immutable:

data[0] = 100
Traceback (most recent call last):
 ...
   data[0] = 100
   ~~~~^^^
TypeError: 'bytes' object does not support item assignment

Python raises a TypeError when we try to change a value within this sequence. Let's confirm that a bytes object can only contain integers in the range from 0 to 255:

data = bytes([254, 255, 256])
Traceback (most recent call last):
 ...
   data = bytes([254, 255, 256])
          ^^^^^^^^^^^^^^^^^^^^^^
ValueError: bytes must be in range(0, 256)

The ValueError states that an integer representing a byte must be a member of range(0, 256). The end of the range is excluded, as is the convention with a range of numbers in Python.
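The 0 to 255 limit follows from the eight bits in a byte. The short sketch below, using arbitrary example values, prints each element of a bytes object next to its eight-digit binary form:

```python
data = bytes([0, 68, 255])

# Format each element as eight binary digits (one byte)
bit_patterns = [format(value, "08b") for value in data]

for value, bits in zip(data, bit_patterns):
    print(value, bits)
```

The largest value, 255, uses all eight bits, which is why 256 doesn't fit in a single byte.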

We can display the bytes object using print():

data = bytes([68, 97, 116, 97, 67, 97, 109, 112])
print(data)
b'DataCamp'

The output shows the bytes object as a sequence of characters. Python displays each integer using the ASCII character that corresponds to its value. The prefix b before the quotes shows that this is a bytes object.

We can also create a bytes object using the bytes literal directly:

data = b"DataCamp"
print(data)
print(data[0])
b'DataCamp'
68

The object is displayed using characters, but its elements are integers.
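A quick way to see the difference is to compare indexing, slicing, and iteration. This is a small illustrative sketch. The variable names are arbitrary:

```python
data = b"DataCamp"

first = data[0]       # indexing returns a single int
prefix = data[:4]     # slicing returns a new bytes object
values = list(data)   # iterating yields ints

print(first)
print(prefix)
print(values)
```

Indexing gives the integer 68, whereas the slice is still a bytes object and is displayed with the b prefix.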

However, bytes objects are only human-readable when using integers corresponding to printable ASCII characters. Since ASCII is a 7-bit encoding, it contains 128 codes, but only 95 of these ASCII characters are printable.

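Integers between 0 and 127 that don't have a printable ASCII character and integers between 128 and 255 are displayed as hexadecimal escape sequences (a few control characters, such as tab and newline, are shown with their shorter escapes \t and \n instead):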

data = bytes([0, 68, 97, 116, 97, 67, 97, 109, 112, 200])
print(data)
b'\x00DataCamp\xc8'

The first integer in the list is now 0, which isn't a printable character in ASCII. The bytes object displays this integer as \x00, the hexadecimal representation of 0. The final integer is 200, which is out of the ASCII range but within the range of integers that can be used in bytes. Python displays this integer as \xc8, which is 200 in hexadecimal.
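If the mixed display of characters and escape sequences is hard to read, one option is to view every byte as two hexadecimal digits. The sketch below uses the built-in .hex() method and the bytes.fromhex() class method on the same data:

```python
data = bytes([0, 68, 97, 116, 97, 67, 97, 109, 112, 200])

# Two hexadecimal digits per byte, with no ASCII substitution
hex_text = data.hex()
print(hex_text)

# bytes.fromhex() reverses the conversion
restored = bytes.fromhex(hex_text)
print(restored == data)
```

The hexadecimal string starts with 00 and ends with c8, matching the \x00 and \xc8 escapes in the output above.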

The bytes and str types share common features. Both are immutable sequences, but bytes objects contain integers, whereas strings contain characters:

data = bytes([68, 97, 116, 97, 67, 97, 109, 112])
print(data[0])

text = "DataCamp"
print(text[0])
68
D

Bytes are used to transfer data across networks and between different interfaces. The next section of this article explores how to convert bytes to strings.

Converting Bytes to Strings: The .decode() Method

A bytes object in Python is human-readable only when it contains readable ASCII characters. In most applications, these characters are not sufficient. We can convert a bytes object into a string using the .decode() method:

data = bytes([68, 97, 116, 97, 67, 97, 109, 112])
text = data.decode()

print(text)
print(type(text))
DataCamp
<class 'str'>

There are different encodings for converting between bytes and strings. The integers in the example above represent printable ASCII characters, but the default encoding used when no argument is passed to .decode() is UTF-8, not ASCII. UTF-8 is the most widely used encoding for Unicode characters.

The first 128 UTF-8 codes match the ASCII codes, which is why the bytes object data is decoded to the string "DataCamp" in the example above.

However, Unicode contains nearly 150,000 characters. UTF-8 encodes each non-ASCII character in Unicode using two to four bytes.
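To make the variable length concrete, the following sketch encodes four sample characters and records how many bytes each one needs in UTF-8. The characters are picked only for illustration:

```python
# Sample characters chosen for illustration
samples = ["e", "é", "€", "🐍"]

# Map each character to the number of bytes in its UTF-8 encoding
byte_counts = {char: len(char.encode("utf-8")) for char in samples}

for char, count in byte_counts.items():
    print(char, count, char.encode("utf-8"))
```

The plain ASCII letter needs one byte, while the other three characters need two, three, and four bytes.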

Let's create a new bytes object and decode it:

data = bytes([195, 169])
print(data)

text = data.decode()
print(text)
b'\xc3\xa9'
é

The bytes object contains two bytes: the integers 195 and 169. Python displays the hexadecimal representation of these bytes. However, .decode() returns a string by decoding the bytes object using the UTF-8 encoding. The pair of bytes b'\xc3\xa9' represents the letter e with an acute accent, é.

The .decode() method takes an optional argument to specify the encoding:

text = data.decode("utf-8")
print(text)
é

UTF-8 is the default encoding, but other encodings can be used. We can also convert a string to a bytes object using the string method .encode() with various encodings:

text = "éclair"
data = text.encode("utf-8")
print(data)

data = text.encode("utf-16")
print(data)
b'\xc3\xa9clair'
b'\xff\xfe\xe9\x00c\x00l\x00a\x00i\x00r\x00'

The same string converts to different bytes objects depending on which character encoding we use.
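Whichever encoding is used, decoding with the same encoding should restore the original string. The sketch below checks this round trip for three encodings and records the size of each result. Latin-1 is added here purely as a third example:

```python
text = "éclair"

sizes = {}
for encoding in ("utf-8", "utf-16", "latin-1"):
    data = text.encode(encoding)
    # Decoding with the matching encoding restores the original string
    assert data.decode(encoding) == text
    sizes[encoding] = len(data)

print(sizes)
```

The UTF-16 result is the longest because it uses two bytes per character here and starts with a two-byte byte order mark, which is the b'\xff\xfe' prefix in the output above.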

Encoding Errors

The choice of character encodings can lead to errors if a different encoding is used when converting between bytes and strings. Let's use the example with the string "éclair" encoded to bytes using the UTF-8 encoding:

text = "éclair"
data = text.encode("utf-8")
print(data)
b'\xc3\xa9clair'

This sequence of bytes represents characters using the UTF-8 encoding. The .decode() method raises an error if we use the wrong encoding:

text = data.decode("ascii")
print(text)
Traceback (most recent call last):
 ...
   text = data.decode("ascii")
          ^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
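A wrong encoding doesn't always raise an exception. As an illustrative sketch, the same bytes are decoded below as Latin-1, an encoding picked here only as an example. Latin-1 maps every possible byte value to a character, so the call succeeds but returns the wrong text:

```python
data = "éclair".encode("utf-8")

wrong = data.decode("latin-1")  # no exception, but not the original text
right = data.decode("utf-8")    # matches the encoding used to create the bytes

print(wrong)
print(right)
print(wrong == right)
```

Each of the two bytes that encode é in UTF-8 becomes a separate Latin-1 character, so the first string is one character longer than the original.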

We can include the optional errors argument to control how .decode() handles bytes it can't decode. The default value for the errors parameter is "strict", which raises an error for invalid bytes.

The .decode() method can be called with errors="ignore" to ignore all invalid characters:

text = data.decode("ascii", errors="ignore")
print(text)
clair

The first character in the original string is é, which isn't an ASCII character. Therefore, .decode() drops the two bytes that encode é and returns a string without the first character.

Another option is to call .decode() with errors="replace", which replaces invalid bytes with �:

text = data.decode("ascii", errors="replace")
print(text)
��clair

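The two replacement characters mark the two bytes that encode é, neither of which could be decoded as ASCII, so the output makes it clear that data is missing from the string. The choice of error handling depends on the application.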
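To compare the strategies side by side, the sketch below decodes the same bytes with several error handlers. "backslashreplace" is an additional built-in handler not covered above. The results dictionary is just a convenience for this example:

```python
data = b"\xc3\xa9clair"

# Decode the same bytes with different error handlers
results = {
    handler: data.decode("ascii", errors=handler)
    for handler in ("ignore", "replace", "backslashreplace")
}
for handler, text in results.items():
    print(handler, "->", text)

# With the default errors="strict", the exception reports what went wrong
try:
    data.decode("ascii")
except UnicodeDecodeError as error:
    strict_info = (error.start, error.reason)
    print("strict ->", strict_info)
```

"backslashreplace" keeps the undecodable bytes visible as escape sequences, which can be useful when logging or debugging data of unknown origin.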

Converting Bytes to Strings With str()

The .decode() method is the main route to convert bytes into strings. However, it's also possible to use the str() constructor directly. Using the str() constructor on a bytes object will create a string representation of the raw bytes:

data = b'\xc3\xa9clair'
text = str(data)

print(text)
print(type(text))
b'\xc3\xa9clair'
<class 'str'>

However, the str() constructor also accepts an encoding argument, which we can use to create a string based on a specific encoding:

text = str(data, encoding='utf-8')
print(text)
éclair
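The difference between the two calls is easy to miss, so here is a small sketch that compares them. It also passes the errors argument, which str() accepts alongside encoding:

```python
data = b"\xc3\xa9clair"

as_repr = str(data)                    # no encoding: the printable representation
as_text = str(data, encoding="utf-8")  # decoded text, same result as data.decode("utf-8")
lenient = str(data, encoding="ascii", errors="replace")

print(len(as_repr), len(as_text))
print(as_text == data.decode("utf-8"))
print(lenient)
```

Without an encoding, the result includes the b prefix, the quotes, and the escape sequences as literal characters, so it's much longer than the decoded text.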

Converting Bytes to Strings With codecs.decode()

The codecs module in Python's standard library provides the interface for encoding and decoding data. This module also provides a decode() function that can be used to convert Python bytes to strings:

import codecs

data = b'\xc3\xa9clair'
text = codecs.decode(data, encoding='utf-8')
print(text)
éclair

These alternative conversion techniques are not methods of the bytes type but functions that require the bytes object to be passed as an argument.
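The codecs module can also help when bytes arrive in pieces, for example when reading from a network connection. The sketch below assumes the data comes in three hypothetical chunks, with the two bytes of é split across the first two. It uses an incremental decoder, which holds back an incomplete character until the rest arrives:

```python
import codecs

# Hypothetical chunks: the two bytes that encode "é" are split across chunks
chunks = [b"\xc3", b"\xa9cl", b"air"]

decoder = codecs.getincrementaldecoder("utf-8")()
parts = [decoder.decode(chunk) for chunk in chunks]

# final=True flushes the decoder and raises an error if bytes are left over
parts.append(decoder.decode(b"", final=True))

text = "".join(parts)
print(text)
```

Calling .decode() on each chunk separately would fail on the first chunk, since a lone b"\xc3" isn't a complete UTF-8 sequence.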

Conclusion

Data is often transferred as bytes, and many Python applications need to convert between bytes and strings. A bytes object is an immutable sequence of bytes that can be converted to a string using one of several character encodings.

The main route to convert Python bytes to a string is the .decode() method, which allows users to choose an encoding and an error-handling strategy.

Author
Stephen Gruppetta

I studied Physics and Mathematics at UG level at the University of Malta. Then, I moved to London and got my PhD in Physics from Imperial College. I worked on novel optical techniques to image the human retina. Now, I focus on writing about Python, communicating about Python, and teaching Python.
