Course

Data Processing in Shell

Skill level: Intermediate

Updated October 2025

Learn powerful command-line skills to download, process, and transform data, including how to build a machine learning pipeline.

Shell, Data Manipulation

4 hours

13 videos

46 exercises

3,550 XP

22,934

Certificate of completion

Course Description

We live in a busy world with tight deadlines. As a result, we fall back on what is familiar and easy, favoring GUIs like Visual Studio and RStudio. However, taking the time to learn data analysis on the command line is a great long-term investment because it makes us stronger and more productive data people.

In this course, we will take a practical approach to learning simple, powerful, and data-specific command-line skills. Using publicly available Spotify datasets, we will learn how to download, process, clean, and transform data, all via the command line. We will also learn advanced techniques such as command-line-based SQL database operations. Finally, we will combine the power of the command line and Python to build a data pipeline for automating a predictive model.

Prerequisites

Introduction to Shell, Intermediate Python, Intermediate SQL

1

Downloading Data on the Command Line

In this chapter, we learn how to download data files from web servers via the command line. In the process, we also learn about documentation manuals, option flags, and multi-file processing.

Downloading data using curl

Using curl documentation

Downloading single file using curl

Downloading multiple files using curl

Downloading data using Wget

Installing Wget

Downloading single file using wget

Advanced downloading using Wget

Setting constraints for multiple file downloads

Creating wait time using Wget

Data downloading with Wget and curl
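
As a rough illustration of the curl options this chapter covers, the sketch below saves a single file under a new name and then fetches a numbered series of files. To stay runnable offline it downloads from local `file://` URLs; the `demo_src` and `demo_dl` directories and the `datafile1.csv` to `datafile3.csv` names are made up for the example and are not taken from the course. The commented Wget line shows what a throttled batch download might look like, assuming a hypothetical `url_list.txt`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample files standing in for data on a web server.
mkdir -p demo_src demo_dl
for i in 1 2 3; do
  printf 'track_id,plays\n%s,%s\n' "$i" "$((i * 10))" > "demo_src/datafile$i.csv"
done
base_url="file://$PWD/demo_src"

cd demo_dl

# -s hides the progress meter; -o saves the download under a new name.
curl -s -o renamed.csv "$base_url/datafile1.csv"

# -O keeps the remote file name; [1-3] is curl's URL globbing for a numeric range.
curl -s -O "$base_url/datafile[1-3].csv"

ls
cd ..

# Wget counterpart (not run here; url_list.txt is a hypothetical list of URLs):
#   -c resumes partial downloads, -i reads URLs from a file,
#   --wait pauses between files, --limit-rate caps bandwidth.
# wget -c -i url_list.txt --wait=1 --limit-rate=200k
```

Against a real server, only `base_url` would change; adding `-L` tells curl to follow redirects.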

2

Data Cleaning and Munging on the Command Line

We continue our data journey from data downloading to data processing. In this chapter, we use the command-line library csvkit to convert, preview, filter, and manipulate files to prepare our data for further analysis.

Getting started with csvkit

Installation and documentation for csvkit

Converting and previewing data with csvkit

File conversion and summary statistics with csvkit

Filtering data using csvkit

Printing column headers with csvkit

Filtering data by column with csvkit

Filtering data by row with csvkit

Stacking data and chaining commands with csvkit

Stacking files with csvkit

Chaining commands using operators

Data processing with csvkit
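
The csvkit steps listed above might look roughly like the following sketch. The two CSV files and their rows are invented for illustration (they are not the course's Spotify data), and the csvkit commands only run if csvkit is already installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Two small hypothetical CSV files (made-up rows).
printf 'track_id,genre,plays\n1,pop,10\n2,rock,20\n' > tracks_a.csv
printf 'track_id,genre,plays\n3,pop,30\n' > tracks_b.csv

if command -v csvcut >/dev/null 2>&1; then
  # Print column names with their positions.
  csvcut -n tracks_a.csv

  # Keep two columns by name.
  csvcut -c track_id,plays tracks_a.csv

  # Keep rows whose genre column contains "pop".
  csvgrep -c genre -m pop tracks_a.csv

  # Stack both files, tagging each row with its source, then chain with a pipe
  # and redirect the filtered result into a new file.
  csvstack -g a,b -n source tracks_a.csv tracks_b.csv \
    | csvgrep -c genre -m pop > pop_tracks.csv

  # Preview the result as a table.
  csvlook pop_tracks.csv
else
  echo "csvkit not found; it can be installed with: pip install csvkit"
fi
```

The pipe (`|`) passes one command's output to the next, while `>` writes the final result to a file; `&&` could be used instead of a pipe to run a second command only if the first succeeds.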

3

Database Operations on the Command Line

In this chapter, we dig deeper into all that the csvkit library has to offer. In particular, we focus on database operations we can do on the command line, including table creation, data pulls, and various ETL transformations.

Pulling data from database

Using sql2csv documentation

Understand sql2csv connectors

Practice pulling data from database

Manipulating data using SQL syntax

Applying SQL to a local CSV file

Cleaner scripting via shell variables

Joining local CSV files using SQL

Pushing data back to database

Practice pushing data back to database

Database and SQL with csvkit
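
A minimal sketch of the database round trip this chapter describes, using csvkit's `csvsql` and `sql2csv` against a local SQLite file. The file `tracks.csv`, the database `spotify_demo.db`, and the query are hypothetical examples chosen for this sketch, and the csvkit commands only run if csvkit is installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical local data and database names, used only for this sketch.
printf 'track_id,genre,plays\n1,pop,10\n2,rock,20\n3,pop,30\n' > tracks.csv
db_url="sqlite:///spotify_demo.db"
rm -f spotify_demo.db   # start clean so the table can be created again

# Keeping the SQL in a shell variable makes the command itself easier to read.
sql_query="SELECT genre, SUM(plays) AS total_plays FROM tracks GROUP BY genre ORDER BY genre"

if command -v csvsql >/dev/null 2>&1; then
  # Run SQL directly against a local CSV; the table name is the file name without .csv.
  csvsql --query "$sql_query" tracks.csv

  # Push the CSV into a database table (created from the inferred column types).
  csvsql --db "$db_url" --insert tracks.csv

  # Pull data back out of the database as CSV.
  sql2csv --db "$db_url" --query "$sql_query" > genre_totals.csv
  cat genre_totals.csv
else
  echo "csvkit not found; it can be installed with: pip install csvkit"
fi
```

Joining two local CSV files works the same way: pass both files to `csvsql --query` and refer to each by its file name (without the extension) in the `JOIN`.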

4

Data Pipeline on the Command Line

In the last chapter, we bridge the command line and other data science languages and learn how they can work together. Using Python as a case study, we learn to execute Python on the command line, install dependencies using the package manager pip, and build an entire model pipeline using the command line.

Python on the command line

Finding Python version on the command line

Executing Python script on the command line

Python package installation with pip

Understanding pip's capabilities

Installing Python dependencies

Running a Python model

Data job automation with cron

Understanding cron scheduling syntax

Scheduling a job with crontab

Model production on the command line

Course recap
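
To make the pipeline idea concrete, here is a small sketch that checks the Python version, runs a script from the shell, and drafts a cron schedule for it. The script `demo_model.py` is a trivial stand-in for a real model, and `requirements.txt` and `cron_entry.txt` are hypothetical file names; nothing is actually installed or scheduled.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Check which Python the shell will use.
python3 --version

# Dependencies are typically installed from a requirements file (not run here;
# requirements.txt is a hypothetical file name):
# pip install -r requirements.txt

# A stand-in "model" script, written with a here-document so the sketch is self-contained.
cat > demo_model.py <<'PY'
plays = [10, 20, 30]
print(f"mean_plays={sum(plays) / len(plays)}")
PY

# Execute the script from the command line.
python3 demo_model.py

# A crontab entry has five time fields (minute, hour, day of month, month,
# day of week) followed by the command. This one would run at 06:00 every Monday.
cron_entry="0 6 * * 1 python3 $PWD/demo_model.py"
echo "$cron_entry" > cron_entry.txt
cat cron_entry.txt

# To schedule it, the line would be added via `crontab -e`; `crontab -l` lists current jobs.
```

Using an absolute path in the cron entry matters because cron does not run jobs from the directory where the entry was written.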
