Track Overview

Professional Data Engineer in Python

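Take your skills to the next level with our Professional Data Engineer track. This advanced track is designed to build on the Associate Data Engineer in SQL and Data Engineer in Python tracks, equipping you with the up-to-date knowledge and tools that modern data engineering roles require. Along the way, you will learn modern data architecture, strengthen your Python skills through a deeper understanding of object-oriented programming, explore NoSQL databases, and use dbt for seamless data transformation. You will also cover DevOps essentials, including core practices, advanced testing techniques, and tools such as Docker, to streamline development and deployment. You will work with big data technologies using PySpark and learn data processing and automation with shell scripting. Hands-on projects let you apply what you learn to real datasets, debug complex workflows, and optimize data processing. By the end of the track, you will have the advanced skills needed to tackle complex data engineering challenges and the confidence to apply them in a fast-changing field.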

Prerequisites

Data Engineer

Course
1
Understanding Modern Data Architecture
Study modern data architecture systematically, covering the key components from ingestion to serving, governance, and orchestration.
Course
2
Introduction to Shell
The Unix command line helps users combine existing programs in new ways, automate repetitive tasks, and run programs on clusters and in the cloud.
Course
3
Containerization and Virtualization Concepts
Learn the fundamentals of virtual machines (VMs), containers, Docker, and Kubernetes. Start by understanding how they differ!
Course
4
Introduction to dbt
This course introduces dbt for data modeling, transformation, testing, and documentation.
Course
5
Introduction to Object-Oriented Programming in Python
Discover the fundamental concepts of object-oriented programming (OOP), building custom classes and objects!
Course
6
Introduction to NoSQL
Learn NoSQL and strengthen your data platform. Work with big data in Snowflake, handle document data with Postgres JSON, and manage key-value data with Redis.
Course
7
DevOps Concepts
Build a foundation with this introduction to DevOps and learn the key concepts, tools, and techniques that improve productivity.
Course
8
Introduction to Testing in Python
Learn testing in Python. Study the techniques, write checks, and use pytest and unittest to help keep your code free of errors.
Project
Bonus
Debugging Code
Sharpen your debugging skills to enhance sales data accuracy.
Course
10
Introduction to Docker
Get an overview of Docker and its importance in the data professional's toolkit. Learn about Docker containers, images, and more.
Course
11
Introduction to PySpark
Learn PySpark and handle big data with ease. Learn how to process and query large datasets, then optimize them for powerful analytics!
Chapter
Bonus
Introduction to Big Data analysis with Spark
This chapter introduces the exciting world of Big Data, as well as the various concepts and different frameworks for processing Big Data. You will understand why Apache Spark is considered the best framework for Big Data.
Chapter
Bonus
Programming in PySpark RDD’s
The main abstraction Spark provides is the resilient distributed dataset (RDD), the fundamental data type and backbone of the engine. This chapter introduces RDDs and shows how they can be created and processed using RDD Transformations and Actions.
Chapter
Bonus
PySpark SQL & DataFrames
In this chapter, you'll learn about Spark SQL, a Spark module for structured data processing. It provides a programming abstraction called DataFrames and can also act as a distributed SQL query engine. This chapter shows how Spark SQL allows you to use DataFrames in Python.
Project
Bonus
Cleaning an Orders Dataset with PySpark
Step into a data engineer's shoes and master data cleaning with PySpark on an e-commerce orders dataset!
Chapter
Bonus
Downloading Data on the Command Line
In this chapter, we learn how to download data files from web servers via the command line. In the process, we also learn about documentation manuals, option flags, and multi-file processing.
Chapter
Bonus
Data Pipeline on the Command Line
In the last chapter, we connect the command line with other data science languages and learn how they can work together. Using Python as a case study, we learn to execute Python on the command line, to install dependencies using the package manager pip, and to build an entire model pipeline using the command line.
Course
18
Streaming Concepts
Learn the differences between batch processing and streaming, how to scale streaming systems, and real-world applications.
Course
19
Introduction to Apache Kafka
Learn Apache Kafka from the basics through its architecture. Learn to create, manage, and troubleshoot Kafka for real-world data streaming.
Course
20
Introduction to Kubernetes
In this course, you will learn the fundamentals of Kubernetes and deploy and orchestrate containers using Manifests and kubectl.
Resource
Bonus
Impactful Data Engineering—with Datadog's Wouter de Bie
Understand how data engineering can impact your business.