Skip to main content

How to Analyze Data in Google Sheets With Python: A Step-By-Step Guide

Boost your data analysis skills with our step-by-step guide on how to analyze, manipulate and write back data in Google Sheets using Python.
May 2023  · 11 min read

Companies run on spreadsheets. From financial management and budgeting to project management and KPI reporting: every team relies on spreadsheets in some shape or form.

If you want to be effective as an analyst in this spreadsheet-driven environment, it’s crucial that you know how to programmatically access data in spreadsheets as well as write the results of your analyses back to spreadsheets.

By automating the process, you can meet your stakeholders where their data is stored and avoid the tedium of manually exporting and importing data into spreadsheets.

In this tutorial, we’ll cover all the necessary steps to access data that lives in a Google Sheet spreadsheet, analyze it, and write it back to a spreadsheet. We will do all of this with Python with no repetitive manual steps involved!

We’ll use DataCamp Workspace in this tutorial, as it has built-in functionality for securely storing secrets as environment variables, a feature we’ll use to store Google API credentials. To follow along, all you need is a Google account (to house the spreadsheet) and a DataCamp account (to use DataCamp Workspace), both of which you can create for free!

More specifically, the steps we will take are:

  • Configure a Google Service Account
  • Create a Google spreadsheet with data
  • Create a new workspace
  • Store your service account credentials in Workspace
  • Read Google Sheets data
  • Analyze the data in Python
  • Write results to Google Sheets

Below, you’ll find the step-by-step instructions for how to analyze data in Google Sheets with Python.

1. Configure a Google Service Account

To programmatically access data in Google Sheets, you need to create a Google service account, which is a special kind of account that can be used by programs to access Google resources like a spreadsheet. You will use this service account to connect DataCamp Workspace to Google Sheets.

You only have to set up this Google service account once for every Google account that you want to access spreadsheets in; you can skip this step the next time you want to work with spreadsheets data in the same account.

Follow the steps below to create the service account and generate the necessary credentials.

Setting up a Google Service Account
  • Make sure you’re signed in with your Google account.
  • Navigate to the Google API Library
  • Create a new project by clicking in the dropdown on the navbar.
  • Search for the “Google Sheets API” and enable it. This can take up to 10 seconds.
  • In the “APIs and services” navbar on the left, go to the “Credentials tab”
  • Click on “+ CREATE CREDENTIALS” and select “Service Account”
    • In the first step (service account details), provide a name for the service account, e.g., “gsheet-operator” and click on “Create and continue”
    • In the second step, select the “Owner” role and click “Continue”
    • In the third step, don’t change anything and click “Done”
  • Once back on the Credentials page, click on the service account you just created.
  • Go to the Keys tab, click “Add Key > Create new key”
  • Choose “JSON,” then click “Create.” The JSON file with your service account credentials will automatically download to your computer.

You now have a service account and a JSON credentials file! Head over to your Downloads folder or wherever the JSON file was downloaded, open it up, and have a look. It should look something like this:

  "type": "service_account",
  "project_id": "<your-project-name>",
  "private_key_id": "<something-private>",
  "private_key": "-----BEGIN PRIVATE KEY-----\nM<some-very-private-stuff\n",
  "client_email": "[email protected]",
  "client_id": "123456789012345678901",
  "auth_uri": "",
  "token_uri": "",
  "auth_provider_x509_cert_url": "",
  "client_x509_cert_url": "<project-name>"

There’s a “client_email” field in there: gsheets-operator@<google-project-name> Copy this email to your clipboard; you’ll need it in the next step.

2. Create a Google Spreadsheet With Data

Before you can analyze data in spreadsheets, you need to make sure you have a spreadsheet with data in it. If you don’t have a dataset lying around, you can start from a Google Sheet that we prepared for this tutorial: Open the example spreadsheet and once in the Google Sheet, click “File > Make a copy,” specify a name, and click “Make a copy.” If you already have a spreadsheet with data you want to analyze, just open up that spreadsheet.

Regardless of whether you’re working with a duplicate of the example spreadsheet or your own spreadsheet, you need to give the Google service account that you created in the first step access to the spreadsheet:

Google Share

Alright, service account: check. Google spreadsheet with the right access: check. It’s time to write some Python code!

3. Create a New Workspace

In this tutorial, we’ll be using DataCamp Workspace, a modern data science notebook with Python and all common data science packages preconfigured. A data project inside Workspace is called a workspace. Let’s create one!

You can either create an empty workspace from scratch, which is useful if you want to write all the code yourself or create a workspace that already has most of the code in this tutorial included by clicking the  "Open Workspace" button below.

Run and edit the code from this tutorial online

Open Workspace

4. Store Your Service Account Credentials in Workspace

To use the service account credentials JSON in your newly created workspace, you need to store it in Workspace.

You could copy and paste the contents of the JSON inside a code cell of your newly created workspace:

JSON Wrong

This is not a secure approach and not recommended. If you share your notebook with someone else, they can see the credentials in clear text. They would then be able to impersonate you and get access to resources this service account has access to.

This approach would also require you to repeat this code in every workspace that you want to use to access spreadsheet data. If you want to rotate the credentials, you’ll have to go through and edit every workspace.

A more secure and scalable approach is storing the JSON in a so-called environment variable. When you connect this environment variable, it will become available in your Python session. In your workspace:

  • Click the “Environment” tab on the left-hand side
  • Click on the plus icon next to “Environment Variables”
  • In the “Add Environment Variables” modal:
    • Set “Name” to GOOGLE_JSON
    • Set “Value” to the full contents of the service account JSON file that was downloaded. You can do this by opening the JSON file, selecting all, copying it to your clipboard, and then pasting it in the Value field.
    • Set the “Environment Variables Set Name” to “Google Service Account” (this can be anything, really)

Setting up GOOGLE_JSON environment variable

After filling in all fields, click “Create,” “Next,” and finally, “Connect.” Your workspace session will restart, and GOOGLE_JSON will now be available as an environment variable in your workspace. You can verify this by creating a Python cell with the following code and running it:

import os
google_json = os.environ["GOOGLE_JSON"]

If you want to reuse the same services account credentials in another workspace, you don’t need to set up the environment variable again: you can reuse the environment variable in other workspaces.

5. Read Google Sheets Data

Now for the fun part: writing Python code to connect to the Google Sheets data.

Install and import the right packages

Start by installing and importing all the packages required to set up a connection to your Google API account. We’ll be using google-auth to authenticate against the Google API, gspread to easily interface with Google spreadsheets, and pandas to manipulate tabular data in Python.

!pip install gspread
from google.oauth2 import service_account
import pandas as pd
import gspread
import json
import os

Set up a client with scoped credentials

Load the GOOGLE_JSON environment variable and store it in a variable:

import os
google_json = os.environ["GOOGLE_JSON"]

Create a credentials object from the service account credentials json:

service_account_info = json.loads(google_json)
credentials = service_account.Credentials.from_service_account_info(service_account_info)

Assign the right ‘API scope’ to the credentials so that it has the right permissions to interface with the Google Sheet spreadsheet:

scope = ['','']
creds_with_scope = credentials.with_scopes(scope)

Create a new client with these scoped credentials:

client = gspread.authorize(creds_with_scope)

Read spreadsheets data into a DataFrame

With the client ready, we can read the data from the spreadsheet created in the first section of this article into a pandas DataFrame. A pandas DataFrame is a table-like data structure that you can use in Python to store and manipulate data.

First, create a new sheet instance from a spreadsheet URL:

spreadsheet = client.open_by_url('')

Next, zoom in on the worksheet (the tab in the Google spreadsheet) you want to get the data from. Let’s get the first workspace using the index 0:

worksheet = spreadsheet.get_worksheet(0)

First, get all the records as a JSON:

records_data = worksheet.get_all_records()

Next, convert the JSON records into a pandas DataFrame:

records_df = pd.DataFrame.from_dict(records_data)


6. Analyze the Spreadsheets Data in Python

You can now use Python to analyze the data you read in from the Google spreadsheet. The minimal code below assumes that you are working with the example spreadsheet. If you are working with another dataset, these steps won’t work. Check out DataCamp’s Data Manipulation with pandas course if you want to dive deeper!

Let’s calculate median funding by country:

median_funding_df = records_df.groupby("country")["funding"].median().reset_index()
median_funding_df.columns = ['country', 'median_funding']

7. Write the Results to Google Sheets

If you want the spreadsheet users you’re collaborating with to see the fruits of your labor in a spreadsheet, you can write back the results of your Python analysis back to a new Google worksheet (i.e., tab) inside the original spreadsheet.

In this example, we create a new worksheet median_country if it doesn’t exist yet, and then insert the contents of the median_funding_df DataFrame, including the header with column names. This code was generated with the help of the AI Assistant!

# Convert the median_funding_df DataFrame to a list of lists
medians_list = median_funding_df.values.tolist()

# Add the header row to the list
medians_list.insert(0, median_funding_df.columns.tolist())

# Check if the 'median_by_country' sheet already exists
sheet_titles = [sheet.title for sheet in spreadsheet.worksheets()]
sheet_name = "median_by_country"
if sheet_name not in sheet_titles:
    # Create a new worksheet named 'median_by_country' in the Google Sheet
    medians_sheet = spreadsheet.add_worksheet(title=sheet_name, rows=len(medians_list), cols=len(medians_list[0]))
    medians_sheet = spreadsheet.worksheet(sheet_name)

# Write the medians_list to the new worksheet
medians_sheet.insert_rows(medians_list, row=1)

If you visit your spreadsheet again, you’ll see a new worksheet with the data from the DataFrame!


The initial setup before you can access data in Google Sheets with Python can take a while. However, once things are properly configured, accessing data from Google Sheets is almost as easy as querying a SQL database or reading in a CSV. The big advantage, though, is that you can meet your stakeholders where they are: it’s likely that some of your colleagues swear by spreadsheets.

Out with the times of laborious CSV exports and in with the times of powerful, automatic workflows that read from and write to spreadsheets without ever leaving DataCamp Workspace.

DataCamp Workspace

Skip the installation process, and get started with Python on your browser using DataCamp Workspace

Learn More

The 7 Best Python Certifications For All Levels

Find out whether a Python certification is right for you, what the best options are, and the alternatives on offer in this comprehensive guide.
Matt Crabtree's photo

Matt Crabtree

18 min

How to Earn a Microsoft Excel Certification in 2024: Top Tips and Resources

Discover step-by-step instructions, exam insights, and expert tips to achieve Excel certification.
Matt Crabtree's photo

Matt Crabtree

12 min

Top 25 Excel Interview Questions For All Levels

A guide to the most common Excel interview questions for beginner, intermediate, and advanced users to ace the technical interview.
Chloe Lubin's photo

Chloe Lubin

17 min

Excel Shortcuts Cheat Sheet

Improve on your Excel skills with the handy shortcuts featured in this convenient cheat sheet!
Richie Cotton's photo

Richie Cotton

4 min

Coding Best Practices and Guidelines for Better Code

Learn coding best practices to improve your programming skills. Explore coding guidelines for collaboration, code structure, efficiency, and more.
Amberle McKee's photo

Amberle McKee

26 min

Pandas Profiling (ydata-profiling) in Python: A Guide for Beginners

Learn how to use the ydata-profiling library in Python to generate detailed reports for datasets with many features.
Satyam Tripathi's photo

Satyam Tripathi

9 min

See MoreSee More