Skip to main content

Working with Spreadsheets in SQL

In this tutorial, learn how to import a spreadsheet into PostgreSQL and perform analysis on it.
Feb 2019  · 5 min read

Spreadsheets are probably one of the most pertinent elements in an organization (be it of any kind). Often, there are situations where you need to have a very important spreadsheet(s) imported in a database server so that it can effectively be queried. In this tutorial, you will learn how to do this in PostgreSQL.

This tutorial assumes that you are already familiar with SQL and how to write simple SQL queries in PostgreSQL. If you are new to SQL, following are some resources which might help you -

This tutorial is divided into following two major sections -

  • Converting a spreadsheet into a .csv file
  • Importing a .csv file into a PostgreSQL database

Converting a spreadsheet into a .csv file

To get started you will need a spreadsheet to proceed with. You can create a spreadsheet in any Excel editor like Google Sheets, MS-Excel, etc. To learn the basics about spreadsheets, you can take DataCamp's Spreadsheet Basics course or read this beginner spreadsheet tutorial. To be able to follow along with this tutorial, you can use this spreadsheet containing the following columns -

  • ID
  • Name
  • Age
  • Percentage Marks

The spreadsheet has a total of 20 records.

Let's quickly look at snapshot (containing the first 10 rows) of the spreadsheet - spreadsheet

The spreadsheet contains information of a few college students.

We can infer the probable datatypes of the columns from the snapshot -

  • ID, Age and Percentage Marks are numeric columns. More specifically, ID and Age columns can be of integer type and the column Percentage Marks can be of float type.
  • The column Name is of character datatype.

The easiest way to import this spreadsheet (you can refer to it as dataset as well) into a PostgreSQL database is first to convert the spreadsheet into a .csv file and then import it in a PostgreSQL database. CSV stands for comma-separated values. This is two main advantages -

  • .csv files are much lighter than spreadsheets and therefore easier to handle in low memory.
  • There are many in-built utilities in PostgreSQL to handle .csv files in efficient ways.

Every Excel editor lets you convert a spreadsheet into a .csv file. On Google Sheets, you can do it using this navigation - File -> Download as -> Comma-separated values.

Importing a .csv file into a PostgreSQL database

Now to be able to import the .csv file you just created into a PostgreSQL database, you will need a PostgreSQL database first. Let's quickly create one using the pgAdmin interface.

You can create a database with any name with your choice. When you are done with that, you will need to create a table in that database which will hold the entries of the .csv file which you are going to import. Let's create a table named Student_Records in the database you just created.

CREATE TABLE Student_Records
(
  ID serial NOT NULL,
  Name varchar(50),
  Age SMALLINT,
  Percentage_Marks float8
);

You kept the datatypes of the columns as discussed above. For convenience, you kept the column names the same as well. Now navigate to the table you just created in pgAdmin and right click on the table and select Import from the options. For your reference - https://www.loom.com/share/2eeb1a2c94bf4f928c3ffa5b1b72eef4.

You can ignore the errors that happened during the imports as seen in the screencast. You can verify if the records were correctly imported or not by running a select query.

Let's now how the above can be done using the psql prompt. For doing that let's first truncate the table Student_Records using truncate table Student_Records; query.

Now, open up a psql prompt which should look like - psql prompt

Now switch to the database that you created for importing the spreadsheet initially. You can switch to a database in PostgreSQL using the following command -

\c database_name

If it was successful the psql prompt will get changed to the name of the database you specified. You can now issue the following command to have the .csv file imported into Students_Records table -

COPY Student_Records(ID,Name,Age,Percentage_Marks) FROM 'D:\Students.csv' DELIMITER ',' CSV HEADER;

Now execute a select query to verify if the import was done right.

You find more about the Copy command from here and play with its different options.

Conclusion

That is all for this tutorial. You can now import a spreadsheet into PostgreSQL and do your analysis just as you do with standard SQL tables.

SQL and Spreadsheets Courses

Introduction to SQL

Beginner
2 hr
202.2K
Learn how to create and query relational databases using SQL in just two hours.
See DetailsRight Arrow
Start Course
See MoreRight Arrow
Related

Inside Our Favorite DataFramed Episodes of 2022

An inside look at our favorite episodes of the DataFramed podcast of 2022
Adel Nehme's photo

Adel Nehme

2 min

[Infographic] Data Science Project Checklist

Use this checklist when planning your next data science project.
Adel Nehme's photo

Adel Nehme

Reshaping Data with pandas in Python

Pandas DataFrames are commonly used in Python for data analysis, with observations containing values or variables related to a single object and variables representing attributes across all observations.
Richie Cotton's photo

Richie Cotton

Reshaping Data with tidyr in R

In this cheat sheet, you will learn how to reshape data with tidyr. From separating and combining columns, to dealing with missing data, you'll get the download on how to manipulate data in R.
Richie Cotton's photo

Richie Cotton

6 min

ChatGPT Cheat Sheet for Data Science

In this cheat sheet, gain access to 60+ ChatGPT prompts for data science tasks.
Travis Tang's photo

Travis Tang

10 min

Data Quality Dimensions Cheat Sheet

In this cheat sheet, you'll learn about data quality dimensions, allowing you to ensure that your data is fit for purpose.
Joe Franklin's photo

Joe Franklin

3 min

See MoreSee More