Track Description
Discover the main components of modern data architecture: ingestion, serving, governance, and orchestration.
The Unix command line helps you combine programs, automate tasks, and run them on clusters and in the cloud.
Learn the essentials of VMs, containers, Docker, and Kubernetes. Understand the differences to get started!
This course introduces dbt for data modeling, transformations, testing, and building documentation.
Discover the fundamental concepts of object-oriented programming (OOP) by creating classes and objects!
Conquer NoSQL and optimize your data flows. Learn Snowflake for big data, Postgres JSON for documents, and Redis for key-value data.
In this Introduction to DevOps, you’ll master the DevOps basics and learn the key concepts, tools, and techniques to improve productivity.
Master testing in Python: learn methods, create checks, and ensure error-free code with pytest and unittest.
Sometimes, things that once worked perfectly suddenly hit a snag. Practice your knowledge of DataFrames to find the problem and fix it!
Get to know Docker and why it matters for data professionals. Learn about Docker containers and images.
In this chapter, you'll learn how Spark manages data and how you can read and write tables from Python.
In this chapter, you'll learn about the pyspark.sql module, which provides optimized data queries to your Spark session.
This chapter introduces the exciting world of Big Data, as well as the various concepts and different frameworks for processing Big Data. You will understand why Apache Spark is considered the best framework for Big Data.
The main abstraction Spark provides is the resilient distributed dataset (RDD), the fundamental data type and backbone of this engine. This chapter introduces RDDs and shows how they are created and operated on using RDD transformations and actions.
In this chapter, you'll learn about Spark SQL, which is a Spark module for structured data processing. It provides a programming abstraction called DataFrames and can also act as a distributed SQL query engine. This chapter shows how Spark SQL allows you to use DataFrames in Python.
Step into a data engineer's shoes and master data cleaning with PySpark on an e-commerce orders dataset!
In this chapter, we learn how to download data files from web servers via the command line. In the process, we also learn about documentation manuals, option flags, and multi-file processing.
In the last chapter, we bridge the command line and other data science languages and learn how they can work together. Using Python as a case study, we learn to execute Python on the command line, to install dependencies using the package manager pip, and to build an entire model pipeline using the command line.
Learn about the difference between batching and streaming, scaling streaming systems, and real-world applications.
Master Apache Kafka! From core concepts to advanced architecture, learn to create, manage, and troubleshoot Kafka for real-world data streaming challenges!
In this course, you will learn the fundamentals of Kubernetes and how to orchestrate containers with kubectl.
Understand how data engineering can impact your business.
