Skip to main content

openpyxl: Automate Excel Tasks with Python

Learn how to read, write, and style Excel files using openpyxl. Discover practical examples for working with modern Excel formats (.xlsx) and advanced features.
May 29, 2025  · 8 min read

openpyxl is an efficient Python library that enables you to read and write Excel files in the modern XML-based formats (.xlsx, .xlsm) introduced in Excel 2007, with full support for advanced features from Excel 2010 onward. It was originally developed as a Python port of PHPExcel and continues to be maintained by an active community of developers. The library excels at data automation, reporting, and formatting workflows, making it especially useful for users who need to work with Excel files but don't have Excel installed on their systems.

Many organizations use openpyxl for automating financial reports, generating charts from database exports, and applying formulas and formatting to marketing reports. This library offers fine-grained control over Excel files entirely through Python code.

What Is openpyxl?

openpyxl is designed specifically to provide native support for Excel 2010+ formats. The library works with several file formats:

  •  .xlsx - Excel workbook

  •  .xlsm - Excel workbook with macros

  •  .xltx - Excel template

  •  .xltm - Excel template with macros

One major advantage of openpyxl is that it doesn't require Excel to be installed on your computer. This makes it ideal for server environments, automated reporting systems, and data processing pipelines. Several other Python libraries, including pandas, use openpyxl as their Excel interface, which demonstrates its reliability and functionality.

Installing and Importing openpyxl

Installing openpyxl is straightforward using pip:

pip install openpyxl

For enhanced security when working with files from untrusted sources, you can also install the optional defusedxml package:

pip install defusedxml

To use openpyxl in your Python scripts, import the necessary classes:

from openpyxl import Workbook, load_workbook

Key Concepts and Terminology

To use openpyxl effectively, you need to understand the basic Excel structure:

  • Workbook: The Excel file itself, containing one or more worksheets
  • Worksheet: Individual tabs/sheets within a workbook
  • Cell: Individual data points in a worksheet, identified by column letter and row number (e.g., "A1")
  • Row: Horizontal line of cells, identified by numbers (1, 2, 3...)
  • Column: Vertical line of cells, identified by letters (A, B, C...)

In openpyxl, you can reference cells using either:

  • Excel style references: sheet[“A1”]

  • Row-column indexing: sheet.cell(row=1, column=1) (Note: openpyxl uses 1-based indexing, not 0-based)

Reading Excel Files with openpyxl

Here's a practical example showing how to read data from an Excel file:

Load and explore a workbook

from openpyxl import load_workbook

# Load the workbook - use read_only=True for large files
wb = load_workbook('sample.xlsx', read_only=False, data_only=False)
# data_only=True reads values instead of formulas

The optional parameters control how the workbook is loaded:

  • read_only=True improves performance for large files but limits editing capabilities

  • data_only=True returns calculated values instead of formulas

Iterate through data

To process multiple cells efficiently, use the iteration methods:

# Iterate through rows
for row in sheet.iter_rows(min_row=1, max_row=5, values_only=True):
    print(row)  # Returns a tuple of values

# Iterate through columns
for column in sheet.iter_cols(min_col=1, max_col=3, values_only=True):
    print(column)  # Returns a tuple of values

# Access a range of cells
cell_range = sheet["A1:C5"]
for row in cell_range:
    for cell in row:
        print(cell.value)

Convert to Python data structures

You can easily transform Excel data into Python data structures:

# Convert worksheet data to a list of dictionaries
def sheet_to_dict(sheet):
    data = []
    headers = [cell.value for cell in sheet[1]]  # First row as headers
    
    for row in sheet.iter_rows(min_row=2, values_only=True):
        row_data = {}
        for key, value in zip(headers, row):
            row_data[key] = value
        data.append(row_data)
    
    return data

# Example usage
data_dict = sheet_to_dict(sheet)
print(data_dict)

Writing Excel Files with openpyxl

Creating and modifying Excel files is equally straightforward with openpyxl.

Create a workbook and write basic data

from openpyxl import Workbook

# Create a new workbook
wb = Workbook()
sheet = wb.active

# Rename the sheet
sheet.title = "Data"

# Write values to cells
sheet["A1"] = "Name"
sheet["B1"] = "Age"
sheet["C1"] = "City"

# Add data
data = [
    ["Alice", 25, "New York"],
    ["Bob", 30, "San Francisco"],
    ["Charlie", 35, "Chicago"]
]

for row_idx, row_data in enumerate(data, start=2):
    for col_idx, cell_value in enumerate(row_data, start=1):
        sheet.cell(row=row_idx, column=col_idx, value=cell_value)

# Save the workbook
wb.save("new_workbook.xlsx")

Modify existing Excel files

from openpyxl import load_workbook

# Load an existing workbook
wb = load_workbook("existing_file.xlsx")
sheet = wb.active

# Modify cell values
sheet["D1"] = "Updated Data"
sheet.cell(row=5, column=2).value = 42

# Save to a new file (to preserve the original)
wb.save("modified_file.xlsx")

Append data dynamically

The .append() method makes it easy to add new rows:

from openpyxl import Workbook

wb = Workbook()
sheet = wb.active

# Add headers
sheet.append(["Date", "Product", "Quantity", "Price"])

# Add multiple rows of data
sales_data = [
    ["2025-04-01", "Laptop", 5, 1200],
    ["2025-04-01", "Mouse", 10, 25],
    ["2025-04-02", "Monitor", 3, 350]
]

for row in sales_data:
    sheet.append(row)

wb.save("sales_report.xlsx")

Formatting and Styling in Excel

openpyxl offers extensive options for cell formatting and styling.

Add fonts, borders, and alignment

from openpyxl import Workbook
from openpyxl.styles import Font, Alignment, Border, Side, PatternFill, NamedStyle

wb = Workbook()
sheet = wb.active

# Create a header style
header_style = NamedStyle(name="header_style")
header_style.font = Font(bold=True, size=12, color="FFFFFF")
header_style.fill = PatternFill(fill_type="solid", start_color="366092")
header_style.alignment = Alignment(horizontal="center", vertical="center")
header_style.border = Border(
    bottom=Side(border_style="medium", color="000000"),
    left=Side(border_style="thin"),
    right=Side(border_style="thin"),
    top=Side(border_style="thin")
)

# Add the style to the workbook
wb.add_named_style(header_style)

# Apply the style to header row
headers = ["ID", "Product", "Category", "Price"]
for col_idx, header in enumerate(headers, start=1):
    cell = sheet.cell(row=1, column=col_idx, value=header)
    cell.style = "header_style"

# Basic styling without named styles
sheet["A2"].font = Font(bold=True, italic=True)
sheet["A3"].alignment = Alignment(horizontal="center")
sheet["A4"].border = Border(bottom=Side(border_style="thin"))

wb.save("styled_workbook.xlsx")

Conditional formatting

Conditional formatting allows you to apply styles based on cell values:

from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill
from openpyxl.styles.differential import DifferentialStyle
from openpyxl.formatting.rule import Rule, ColorScaleRule

wb = Workbook()
sheet = wb.active

# Add sample data
for row in range(1, 11):
    for col in range(1, 5):
        sheet.cell(row=row, column=col, value=(row-1)*4 + col)

# Add color scale (green to red gradient)
color_scale = ColorScaleRule(
    start_type="min", start_color="63BE7B",  # Green
    mid_type="percentile", mid_value=50, mid_color="FFEB84",  # Yellow
    end_type="max", end_color="F8696B"  # Red
)
sheet.conditional_formatting.add("A1:D10", color_scale)

# Add a rule for values less than 5
red_text = Font(color="FF0000")
red_fill = PatternFill(start_color="FFCCCC", end_color="FFCCCC", fill_type="solid")
dxf = DifferentialStyle(font=red_text, fill=red_fill)
rule = Rule(type="cellIs", operator="lessThan", formula=["5"], dxf=dxf)
sheet.conditional_formatting.add("A1:D10", rule)

wb.save("conditional_format.xlsx")

Advanced Features

openpyxl supports many advanced Excel features that help with complex spreadsheet tasks.

Adding formulas

from openpyxl import Workbook

wb = Workbook()
sheet = wb.active

# Add data
for row in range(1, 6):
    sheet.cell(row=row, column=1, value=row)

# Add formulas
sheet["B1"] = "=A1*2"
sheet["B2"] = "=A2*2"
sheet["C1"] = "=SUM(A1:A5)"
sheet["C2"] = "=AVERAGE(A1:A5)"
sheet["C3"] = "=COUNTIF(A1:A5,\">2\")"

wb.save("formulas.xlsx")

Note: When using load_workbook(), formulas are not automatically calculated by openpyxl. To see calculated values, use load_workbook(“file.xlsx”, data_only=True).  

Insert charts

openpyxl can create various chart types:

from openpyxl import Workbook
from openpyxl.chart import BarChart, LineChart, Reference

wb = Workbook()
sheet = wb.active

# Add data
months = ["Jan", "Feb", "Mar", "Apr", "May"]
values = [30, 45, 37, 50, 62]

for i, (month, value) in enumerate(zip(months, values), start=1):
    sheet.cell(row=i, column=1, value=month)
    sheet.cell(row=i, column=2, value=value)

# Create a bar chart
bar_chart = BarChart()
bar_chart.title = "Monthly Sales"
bar_chart.x_axis.title = "Month"
bar_chart.y_axis.title = "Sales"

# Define the data range
data = Reference(sheet, min_col=2, min_row=1, max_row=5)
categories = Reference(sheet, min_col=1, min_row=1, max_row=5)

# Add the data to the chart
bar_chart.add_data(data)
bar_chart.set_categories(categories)

# Add the chart to the worksheet
sheet.add_chart(bar_chart, "D1")

# Create a line chart on the same data
line_chart = LineChart()
line_chart.title = "Monthly Sales Trend"
line_chart.add_data(data)
line_chart.set_categories(categories)
sheet.add_chart(line_chart, "D15")

wb.save("charts.xlsx")

Insert Images

from openpyxl import Workbook
from openpyxl.drawing.image import Image

wb = Workbook()
sheet = wb.active

# Add an image
img = Image("logo.png")

# Resize the image (optional)
img.width = 150
img.height = 75

# Add the image to cell A1
sheet.add_image(img, "A1")

wb.save("with_image.xlsx")

Note: You need the Pillow library installed to work with images: pip install pillow

Sheet, Row, and Column Management

openpyxl provides functions to manage the structure of your Excel files:

Creating and removing sheets

from openpyxl import Workbook

wb = Workbook()

# Create new sheets
wb.create_sheet("Data")
wb.create_sheet("Summary", 0)  # Add at the beginning

# Remove a sheet
wb.remove(wb["Sheet"])  # Remove the default sheet

print(wb.sheetnames)  # ['Summary', 'Data']

wb.save("multiple_sheets.xlsx")

Renaming sheets

wb = Workbook()
sheet = wb.active
sheet.title = "Sales Report"
wb.save("renamed_sheet.xlsx")

Copying worksheets

from openpyxl import Workbook

wb = Workbook()
source = wb.active
source.title = "Original"

# Add some data to copy
source["A1"] = "Test Data"

# Create a copy
wb.copy_worksheet(source)

# The copied sheet will have "Copy of Original" name
wb.save("copied_sheet.xlsx")

Inserting and deleting rows and columns

from openpyxl import Workbook

wb = Workbook()
sheet = wb.active

# Add some data
for i in range(1, 6):
    for j in range(1, 4):
        sheet.cell(row=i, column=j, value=f"R{i}C{j}")

# Insert a row at position 2
sheet.insert_rows(2)

# Insert multiple rows
sheet.insert_rows(5, 3)  # Insert 3 rows at position 5

# Insert a column at position 2
sheet.insert_cols(2)

# Delete rows
sheet.delete_rows(7, 2)  # Delete 2 rows starting at row 7

# Delete columns
sheet.delete_cols(3)  # Delete column C

wb.save("modified_structure.xlsx")

Freezing panes and adding filters

from openpyxl import Workbook

wb = Workbook()
sheet = wb.active

# Add headers
headers = ["ID", "Name", "Department", "Salary"]
for col_idx, header in enumerate(headers, start=1):
    sheet.cell(row=1, column=col_idx, value=header)

# Add data
data = [
    [1, "John Smith", "HR", 55000],
    [2, "Jane Doe", "IT", 65000],
    [3, "Bob Johnson", "Finance", 60000]
]
for row_idx, row_data in enumerate(data, start=2):
    for col_idx, value in enumerate(row_data, start=1):
        sheet.cell(row=row_idx, column=col_idx, value=value)

# Freeze the header row
sheet.freeze_panes = "A2"

# Add autofilter
sheet.auto_filter.ref = "A1:D4"

wb.save("freeze_and_filter.xlsx")

Limitations and Community Feedback

While openpyxl is feature-rich, users have noted some limitations:

  • Limited support for named ranges
  • Documentation examples can be sparse for advanced features
  • Performance can be slow for very large files (though read-only mode helps)

Despite these limitations, openpyxl has strong community support. Issues can be reported on the GitLab tracker, and professional support is sometimes available. For more information and help, you can refer to:

Conclusion

openpyxl is a well-established library for Python-based Excel automation that offers complete support for reading, writing, and styling Excel files. It enables detailed control over spreadsheet output in automated workflows, from manipulating individual cells to creating charts and applying complex formatting.

For users who need to generate or analyze Excel files programmatically, particularly when building reports or processing tabular data, openpyxl provides an extensive toolkit. Its ability to work without requiring Excel installation makes it especially valuable for server environments and automated data pipelines.

Whether you're creating financial models, analyzing data, or generating reports, openpyxl gives you the tools to handle Excel files efficiently through Python code.


Vinod Chugani's photo
Author
Vinod Chugani
LinkedIn

As an adept professional in Data Science, Machine Learning, and Generative AI, Vinod dedicates himself to sharing knowledge and empowering aspiring data scientists to succeed in this dynamic field.

Become a Python Developer

Gain the programming skills all Python Developers need.
Start Learning for Free

openpyxl FAQs

What file types does openpyxl support?

openpyxl supports several Excel formats, including .xlsx, .xlsm, .xltx, and xltm. It does not support the older .xls format used by Excel 2003 and earlier.

Can openpyxl read and evaluate Excel formulas?

openpyxl can read and write formulas, but it does not evaluate them. To see formula results, the file must first be opened and saved in Excel.

Do I need Microsoft Excel installed to use openpyxl?

No, openpyxl works entirely in Python and does not require Excel to be installed on your machine.

​Can I insert images into Excel with openpyxl?

Yes, but you must have the Pillow library installed, and the image file must exist on disk when calling add_image().

Is openpyxl suitable for large Excel files?

Yes, especially when using read_only=True mode, which improves performance for reading large spreadsheets.

Topics

Learn with DataCamp

Track

Associate Python Developer

0 min
Learn Python for software development, from writing functions to defining classes. Get the necessary skills to kickstart your developer career!
See DetailsRight Arrow
Start Course
See MoreRight Arrow
Related

Tutorial

Python Excel: A Guide With Examples

Learn how to read and import Excel files in Python, write data to these spreadsheets, and find the best packages to do this.
Natassha Selvaraj's photo

Natassha Selvaraj

15 min

Tutorial

VBA Excel: How to Get Started and Make Your Work Easier

Learn how to effectively use VBA in Excel to automate tasks, create macros, and enhance your data processing skills with practical examples and best practices.
Laiba Siddiqui's photo

Laiba Siddiqui

10 min

Tutorial

Excel Macros Tutorial: Automate Formatting, Reporting, and Data Entry

Learn how to automate repetitive Excel tasks with Excel macros. Create, use, and manage macros to boost productivity and transform your data workflows.

Rajesh Kumar

Tutorial

Using Python to Power Spreadsheets in Data Science

Learn how Python can be used more effectively than Excel, with the Pandas package.
Jason Graham's photo

Jason Graham

15 min

Tutorial

How to Analyze Data in Google Sheets With Python: A Step-By-Step Guide

Boost your data analysis skills with our step-by-step guide on how to analyze, manipulate and write back data in Google Sheets using Python.
Filip Schouwenaars's photo

Filip Schouwenaars

11 min

Tutorial

Excel Table: The Essential Tool for Organizing Data

Discover how Excel tables go beyond formatting with powerful features like structured references, automatic expansion, filtering, and integration with PivotTables and charts.
Javier Canales Luna's photo

Javier Canales Luna

9 min

See MoreSee More