A relational database is a set of tables (datasets with rows and columns) that contain information relating to other tables in the database.
The diagram below contains information about columns in two tables in an example relational database. Both tables contain columns named customer_id, which establishes a relationship between the tables. As the company grows and records thousands (or millions) of orders, storing data in separate tables helps optimize for space and reduce the size of the database.
SQL, or Structured Query Language, is the standard language for interacting with relational databases. With SQL, you can query, or ask questions of, the data in a relational database. Working with SQL and relational databases is an invaluable skill set for a data analyst, data engineer, or a data scientist.
If you have started looking for ways to learn SQL, you may have noticed the many different dialects of SQL available to learn with some clear (and less clear) distinctions between the different dialects. So where do you begin? Which version of SQL is most helpful to you if you haven’t used it before? In this article, we will focus on four of the most popular database management systems -- PostgreSQL, MySQL, SQLite, and SQL Server -- and their versions of SQL syntax.
The graph below from Stack Overflow Trends provides a sense of how often each of these platforms is discussed -- each line represents the percentage of all Stack Overflow questions about each version of SQL.
MySQL has consistently been the most popular version of SQL in Stack Overflow questions. Second in line is Microsoft SQL Server (including T-SQL, the name of Microsoft’s dialect of SQL), which remains a consistently more popular tag than PostgreSQL and SQLite. This means that if you have a question specific to one of these systems, you’re more likely to find that someone already asked your question.
What’s the difference?
PostgreSQL, MySQL, and SQLite use very similar syntax, with some notable differences highlighted below. Microsoft SQL Server has the greatest contrast in SQL syntax, as well as a wide variety of functions not available in other platforms. The table below highlights some examples of basic differences between SQL platforms.
|SELECT ...||Select [col1], [col2]||SELECT col1, col2||SELECT col1, col2||SELECT col1, col2|
|Data from tables is case sensitive?||Yes WHERE name = ‘John’ Or WHERE name = ‘john’ are not the same||No WHERE name = ‘John’ Or WHERE name = ‘john’ are the same||Yes WHERE name = ‘John’ Or WHERE name = ‘john’ are not the same||Yes WHERE name = ‘John’ Or WHERE name = ‘john’ are not the same|
|Using quotation marks||name = ‘John’ only||name = ‘John’ or name = “John”||name = ‘John’ only||name = ‘John’ or name = “John”|
|Aliases for columns and tables||SELECT AVG(col1)=avg1||SELECT AVG(col1) AS avg1||SELECT AVG(col1) AS avg1||SELECT AVG(col1) AS avg1|
|Working with dates||GETDATE() DATEPART()||CURDATE() CURTIME() EXTRACT()||CURRENT_DATE() CURRENT_TIME() EXTRACT()||DATE(‘now’) strftime()|
|Window functions i.e., OVER(), PARTITION BY()||Yes||Yes||Yes||No (need to use subqueries instead)|
Where do I start?
For students who have little to no experience with SQL and are looking to gain the most broadly applicable skills, I recommend starting with PostgreSQL. Despite the overwhelming popularity of MySQL, PostgreSQL may be a better choice because its syntax most closely conforms to Standard SQL. This means that you can easily translate your skills to other database management systems such as MySQL or SQLite. For example, the query below aggregates data from a database of sales information. It contains a join, an aggregate function, and a filter. This syntax will generate identical results in all three database systems.
Select c.customer_name, SUM(p.amount) AS total_sales FROM customers AS c LEFT JOIN purchases AS p ON c.customers_id = p.customer_id WHERE c.customer_location = 'USA' GROUP BY c.customer_name;
If you anticipate working with Microsoft SQL Server in your career, I recommend you start by learning T-SQL/Microsoft SQL. SQL Server continues to maintain a sizable market share and is an important database management system in many industries. If you are unsure which of these two is more important for you, I recommend browsing through job openings in your field to determine if there is a preference for a specific database system in specific roles or industries.
How can DataCamp help?
At DataCamp, we currently offer two courses in SQL that cover introductory topics and joins in PostgreSQL. We have some exciting courses in development covering intermediate and advanced topics in PostgreSQL. We also have several SQL Server courses in development, including an Introduction to T-SQL/Microsoft SQL course that will provide you with a clear foundation for working with SQL Server. You can check out our course roadmap for more information.
If you’re looking to practice interacting with a PostgreSQL database on your own, I recommend exploring data sets on Kaggle. If you find something that interests you, go ahead and import into PostgreSQL (CSV or SQLite files will import into PostgreSQL) and start exploring!
Come teach with us!
Are you interested in sharing your knowledge and expertise with our 3 million students on DataCamp? We are always looking for instructors to contribute to our growing course library. Our course wishlist highlights the skills we are hoping to make available to our students in the coming months. You can apply here at this link. We look forward to working with you!
Courses for SQL