Which Data Science Technology Should I Learn?

Discover whether you should learn a data science and analytics language, such as Python or R; SQL; spreadsheets; or a business intelligence tool.
Apr 2020  · 5 min read

Not sure where to start? Our Understanding Data Science course provides a non-coding overview of data science essentials.

Or keep reading for more detailed information about the technologies taught on DataCamp.

Data science and analytics languages

If you or your organization are new to data science and analytics, you’ll need to pick a language for analyzing your data, along with a thoughtful way to make that choice. Read our blog post and tutorial on choosing between the two most popular languages for data science, Python and R, or read on for a brief summary.

Python

Python is one of the world’s most popular programming languages. It is production-ready, meaning it has the capacity to be a single tool that integrates with every part of your workflow. So whether you want to build a web application or a machine learning model, Python can get you there!

  • General-purpose programming language (can be used to make anything)
  • Widely considered one of the most accessible programming languages to read and learn
  • The language of choice for cutting-edge machine learning and AI applications
  • Commonly used for putting models "in production"
  • Easy to deploy, with good support for reproducibility

Start learning Python here.

R

R has been used primarily in academia and research, but in recent years, enterprise usage has rapidly expanded. Built specifically for working with data, R provides an intuitive interface to the most advanced statistical methods available today.

  • Built specifically for data analysis and visualization
  • Traditionally used by statisticians and academic researchers
  • The language of choice for cutting-edge statistics
  • A vast collection of community-contributed packages
  • Rapid prototyping of data-driven apps and dashboards

Start learning R here.

SQL

Much of the world's raw data lives in organized collections of tables called relational databases. Data analysts and data scientists must know how to wrangle and extract data from these databases using SQL.

  • Useful for every organization that stores information in databases
  • One of the most in-demand skills in business
  • Used to access, query, and extract structured data that has been organized into a formatted repository, such as a database
  • Its scope includes data query, data manipulation, data definition, and data access control
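
To make the scope described above concrete, here is a minimal sketch that runs SQL from Python using the standard-library sqlite3 module. The `employees` table, its columns, and the sample rows are hypothetical and exist only for illustration. SQLite is used because it needs no server; note that it does not support access-control statements such as GRANT, so only data definition, data manipulation, and data query are shown.

```python
import sqlite3

# In-memory SQLite database; the table, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")

# Data definition: create a table.
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary REAL)"
)

# Data manipulation: insert rows, then update one of them.
conn.executemany(
    "INSERT INTO employees (name, department, salary) VALUES (?, ?, ?)",
    [
        ("Ada", "Analytics", 95000.0),
        ("Grace", "Engineering", 105000.0),
        ("Alan", "Analytics", 88000.0),
    ],
)
conn.execute("UPDATE employees SET salary = salary + 2000 WHERE name = ?", ("Alan",))

# Data query: extract only the rows and columns you need.
analytics_staff = conn.execute(
    "SELECT name, salary FROM employees WHERE department = ? ORDER BY salary DESC",
    ("Analytics",),
).fetchall()
print(analytics_staff)

conn.close()
```

The same statements would look much the same against a production database; mainly the connection step and some dialect details would differ.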

Start learning SQL here.

Databases

Data scientists, analysts, and engineers must constantly interact with databases, which can store a vast amount of information in tables without slowing down performance. You can use SQL to query data from databases and model different phenomena in your data and the relationships between them. Find out the differences between the most popular databases in our blog post or read on for a summary.
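
As a rough illustration of modeling relationships between tables, the sketch below links two hypothetical tables, `customers` and `orders`, through a foreign key and then joins them in a query. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a real database server; all table names, column names, and values are assumptions made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for a real database server.
conn = sqlite3.connect(":memory:")

# Two hypothetical tables; orders.customer_id points at customers.id.
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders (customer_id, amount) VALUES (1, 120.0), (1, 80.0), (2, 50.0);
    """
)

# Join the tables on the shared key and summarize orders per customer.
totals = conn.execute(
    """
    SELECT c.name, COUNT(o.id), SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
    """
).fetchall()
print(totals)

conn.close()
```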

Microsoft SQL Server

  • Commercial relational database management system (RDBMS), built and maintained by Microsoft
  • Available on Windows and Linux operating systems

PostgreSQL

  • Free and open-source RDBMS, maintained by the PostgreSQL Global Development Group and its community
  • Beginner-friendly

Oracle Database

  • The most popular RDBMS, used by 97% of Fortune 100 companies
  • Supports PL/SQL, a procedural extension of SQL, alongside standard SQL for accessing and querying data

Spreadsheets

Spreadsheets are used across the business world to transform mountains of raw data into clear insights by organizing, analyzing, and storing data in tables. Microsoft Excel and Google Sheets are the most popular spreadsheet applications, and both offer a flexible structure in which data is entered into the cells of a table.

Google Sheets

  • Free for users
  • Allows collaboration between users via link sharing and permissions
  • Statistical analysis and visualization must be done manually

Microsoft Excel

  • Requires a paid license
  • Less convenient than Google Sheets for collaboration
  • Contains built-in functions for statistical analysis and visualization

Business intelligence tools

Business intelligence (BI) tools make data discovery accessible for all skill levels—not just advanced analytics professionals. They are one of the simplest ways to work with data, providing the tools to collect data in one place, gain insight into what will move the needle, forecast outcomes, and much more.

Tableau

Tableau is data visualization software that works like a supercharged Microsoft Excel. Its user-friendly drag-and-drop functionality makes it simple for anyone to access and analyze data and to create highly impactful visualizations.

  • Widely used business intelligence (BI) and analytics software, trusted by companies like Amazon, Experian, and Unilever
  • User-friendly drag-and-drop functionality
  • Supports multiple data sources, including Microsoft Excel, Oracle, Microsoft SQL Server, Google Analytics, and Salesforce

Start learning Tableau here.

Microsoft Power BI

Microsoft Power BI allows users to connect and transform raw data, add calculated columns and measures, create simple visualizations, and combine them to create interactive reports.

  • Web-based tool that provides real-time data access
  • User-friendly drag-and-drop functionality
  • Leverages existing Microsoft systems like Azure, SQL Server, and Excel

Start learning Power BI here.

Shell

The shell provides a command-line interface that lets you control your computer's operating system with just a few keystrokes. Sometimes called "the universal glue of programming," it helps users combine existing programs in new ways, automate repetitive tasks, and run programs on clusters and clouds that may be halfway around the world.

Scala

Scala is a hybrid object-oriented and functional programming language popular for large-scale applications and data engineering infrastructure. Favored by companies like Netflix, Airbnb, and Morgan Stanley, Scala improves productivity, application scalability, and reliability.

Git

Version control is one of the power tools of programming. It allows you to keep track of what you did and when, undo any changes you decide you don't want, and collaborate at scale with other people. Git is a modern version control tool that is very popular with data scientists and software developers, and it helps you get more done in less time and with less pain.
