
AWS, Azure and GCP Service Comparison for Data Science & AI

This cheat sheet provides a comparison of the main services needed for data and AI-related work, from data engineering to data analysis and data science, to creating data applications.
Jun 19, 2023 · 17 min read


Cloud computing eliminates the capital expenditure of building and maintaining data centers, enabling businesses to access and pay for only the resources they use. Its scalable nature allows for quick adjustment to changing business needs. Mirroring data simplifies data recovery and business continuity. By providing access to resources from anywhere, cloud computing also supports remote work and collaboration.

The big three public clouds - Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP) - each offer hundreds of services, and it can be hard to determine which ones you need for a given project.


Storage

| Service type | Description | AWS | Azure | GCP |
| --- | --- | --- | --- | --- |
| Object storage | For storing any files you regularly use | Simple Storage Service (S3) | Blob Storage | Cloud Storage Buckets |
| Archive storage | Low-cost (but slower) storage for rarely used files | S3 Glacier Instant, Glacier Flexible, Glacier Deep Archive tiers | Blob Cool/Cold/Archive tiers | Cloud Storage Nearline, Coldline, Archive tiers |
| File storage | For storing files needing hierarchical organization | Elastic File System (EFS), FSx | Avere vFXT, Files | Filestore |
| Block storage | Raw disk volumes attached to virtual machines | Elastic Block Store (EBS) | Disk Storage | Persistent Disk |
| Hybrid storage | Move files between on-prem & cloud | Storage Gateway | StorSimple, Migrate | Storage Transfer Service |
| Edge/offline storage | Move offline data to the cloud | Snowball | Data Box | Transfer Appliance |
| Backup | Prevent data loss | Backup | Backup | Backup and Disaster Recovery |
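
When a script or notebook needs to translate between providers, the rows above can be kept as a small lookup table. The following is a minimal sketch, not part of any cloud SDK: `STORAGE_SERVICES` and `equivalent_service` are hypothetical names chosen for illustration, and the entries simply copy three rows from the Storage table.

```python
# Hypothetical lookup helper; names are illustrative and not part of any cloud SDK.
# The entries copy three rows of the Storage table above.
STORAGE_SERVICES = {
    "object storage": {
        "aws": "Simple Storage Service (S3)",
        "azure": "Blob Storage",
        "gcp": "Cloud Storage Buckets",
    },
    "hybrid storage": {
        "aws": "Storage Gateway",
        "azure": "StorSimple, Migrate",
        "gcp": "Storage Transfer Service",
    },
    "edge/offline storage": {
        "aws": "Snowball",
        "azure": "Data Box",
        "gcp": "Transfer Appliance",
    },
}


def equivalent_service(service_type: str, provider: str) -> str:
    """Return the provider's service name for a service type from the table.

    Lookups ignore case and surrounding whitespace; an unknown service type
    or provider raises KeyError.
    """
    row = STORAGE_SERVICES[service_type.strip().lower()]
    return row[provider.strip().lower()]


print(equivalent_service("Object storage", "GCP"))
```

Keying the outer dict by service type mirrors how the table is read: pick the row first, then the provider column.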

Database

| Service type | Description | AWS | Azure | GCP |
| --- | --- | --- | --- | --- |
| Relational DB management | Standard SQL DB (PostgreSQL, MySQL, SQL Server, etc.) | Relational Database Service (RDS), Aurora | SQL, SQL Database | Cloud SQL, Cloud Spanner |
| NoSQL: Key-value | Redis-like DBs for semi-structured data | DynamoDB | Cosmos DB, Table storage | Cloud Bigtable, Firestore |
| NoSQL: Document | MongoDB/CouchDB-like DBs for hierarchical JSON data | DocumentDB | Cosmos DB | Firestore, Firebase Realtime Database |
| NoSQL: Column store | Cassandra/HBase-like DBs for structured hierarchical data | Keyspaces | Cosmos DB | Cloud Bigtable |
| NoSQL: Graph | Neo4j-like DBs for connected data | Neptune | N/A | N/A |
| Caching | Redis/Memcached-like in-memory stores for fast data access | ElastiCache | Cache for Redis, HPC Cache | Memorystore |
| Time Series DB | DB tuned for time series data | Timestream | Time Series Insights | Cloud Bigtable |
| Blockchain | Dogecoin, etc. | Managed Blockchain | Blockchain Service, Blockchain Workbench, Confidential Ledger | N/A |
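
Some services appear in more than one row of the table above; Cosmos DB, for example, is listed under three NoSQL categories. A reverse index makes that overlap easy to see. The sketch below is illustrative only: `AZURE_DATABASES` and `invert` are hypothetical names, and the data copies the Azure column of four rows from the Database table.

```python
from collections import defaultdict

# Hypothetical data structure; copies the Azure column of four rows
# from the Database table above.
AZURE_DATABASES = {
    "Relational DB management": ["SQL", "SQL Database"],
    "NoSQL: Key-value": ["Cosmos DB", "Table storage"],
    "NoSQL: Document": ["Cosmos DB"],
    "NoSQL: Column store": ["Cosmos DB"],
}


def invert(column: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each service name to the service types it is listed under."""
    index = defaultdict(list)
    for service_type, services in column.items():
        for service in services:
            index[service].append(service_type)
    return dict(index)


service_types_by_name = invert(AZURE_DATABASES)
print(service_types_by_name["Cosmos DB"])
# prints ['NoSQL: Key-value', 'NoSQL: Document', 'NoSQL: Column store']
```

The same function works unchanged on the AWS or GCP column if it is written out in the same shape.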

Compute

| Service type | Description | AWS | Azure | GCP |
| --- | --- | --- | --- | --- |
| Virtual machines | Software-emulated computers | Elastic Compute Cloud (EC2) | Virtual Machines | Compute Engine |
| Spot virtual machines | Discounted VMs on spare capacity that can be interrupted | EC2 Spot Instances | Spot Virtual Machines | Spot VMs |
| Autoscaling | Adjust resources to match demand | EC2 Auto Scaling | Virtual Machine Scale Sets | Instance Groups |
| Functions as a service (Serverless computing) | Execute code chunks without worrying about infrastructure | Lambda | Functions | Cloud Functions |
| Platform as a service | Manage applications without worrying about infrastructure | Elastic Beanstalk, Red Hat OpenShift on AWS | App Service, Cloud Services, Spring Cloud, Red Hat OpenShift | App Engine |
| Batch scheduling | Run code at specified times | Batch | Batch | Batch, Cloud Scheduler |
| Isolated servers | VMs on physical servers dedicated to you, for high security | Dedicated Instances | Dedicated Host | Sole-tenant Nodes, Shielded VMs |
| On-premises/Edge devices | Cloud services on your own hardware | Outposts, Snow Family | Modular Datacenter, Stack Hub, Stack HCI, Stack Edge | N/A |
| Quantum computing | Determine if cat is alive or dead | Braket | Quantum | N/A |

Analytics

| Service type | Description | AWS | Azure | GCP |
| --- | --- | --- | --- | --- |
| Data Warehouse | Centralized platform for all your data | Redshift | Synapse Analytics | BigQuery |
| Big data platform | Run Spark, Hadoop, Hive, Presto, etc. | EMR | Data Explorer, HDInsight | Dataproc |
| Business analytics | Dashboards and visualization | QuickSight, FinSpace | Power BI Embedded, Graph Data Connect | Looker, Looker Studio, Vertex AI Workbench |
| Real-time analytics | Streaming data analytics | Kinesis Data Analytics, Kinesis Data Streams, Managed Streaming for Apache Kafka | Stream Analytics, Event Hubs | Dataflow, Pub/Sub, Datastream |
| Extract-Transform-Load (ETL) | Preprocessing and importing data | Glue, Kinesis Data Firehose, SageMaker Data Wrangler | Data Factory | Data Fusion, Dataflow, Dataproc, Dataprep by Trifacta |
| Workflow orchestration | Build data and model pipelines | Data Pipeline, Managed Workflows for Apache Airflow | Data Factory | Cloud Composer |
| Data lake creation | Import data into a lake | Lake Formation | Data Share | Cloud Storage |
| Managed search | Enterprise search | CloudSearch, OpenSearch Service, Kendra | Cognitive Search | Cloud Search |
| Data Catalog | Metadata management | Glue Data Catalog | Purview, Data Explorer | Data Catalog |

ML & AI

| Service type | Description | AWS | Azure | GCP |
| --- | --- | --- | --- | --- |
| Machine Learning | Train, validate, and deploy ML models | SageMaker | Machine Learning | Vertex AI |
| Jupyter notebooks | Write data analyses and reports | SageMaker Notebooks | Notebooks | Colab |
| Data science/machine learning VM | Virtual machines tailored to data work | Deep Learning AMIs | Data Science Virtual Machines | Deep Learning VM |
| AutoML | Automatically build ML models | SageMaker | Machine Learning Studio, Automated ML | Vertex AI Workbench |
| Natural Language Processing AI | Analyze text data | Comprehend | Text Analytics | Natural Language AI |
| Recommendation AI | Product recommendation engine | Personalize | Personalizer | Recommendations AI |
| Document capture | Extract text from printed documents & handwriting | Textract | Form Recognizer | Document AI |
| Computer vision | Image classification, object detection & other AI with image data | Rekognition, Panorama, Lookout for Vision | Cognitive Services for Vision | Vision AI |
| Speech to text | Speech transcription | Transcribe | Cognitive Services for Speech to Text, Cognitive Services for Speaker Recognition | Speech-to-Text |
| Text to speech | Speech generation | Polly | Cognitive Services for Text to Speech | Text-to-Speech |
| Translation AI | Convert text between human languages | Translate | Cognitive Services for Speech Translation, Translator | Translation AI |
| Video Intelligence | Video indexing and asset search | Rekognition Video | Video Indexer | Video Intelligence API |
| AI agents | Virtual assistants and chatbots | Lex, Alexa Skills Kit | Bot Service, Cognitive Services for Conversational Language Understanding | Dialogflow |
| Human-in-the-loop | Human-based quality control for AI | Augmented AI (A2I) | Cognitive Services Content Moderator | N/A |

Networking & Content Delivery

| Service type | Description | AWS | Azure | GCP |
| --- | --- | --- | --- | --- |
| Content delivery network | Serve content to users | CloudFront | Content Delivery Network | Cloud CDN and Media CDN |
| Application Programming Interface (API) management | Build and deploy APIs | API Gateway | API Apps, API Management | Apigee API Management |
| Domain Name System (DNS) | Route end users to applications | Route 53 | DNS | Cloud DNS |
| Load balancing | Distribute work evenly across machines | Elastic Load Balancing (ELB) | Application Gateway, Load Balancer, Traffic Manager | Cloud Load Balancing |

Containers

| Service type | Description | AWS | Azure | GCP |
| --- | --- | --- | --- | --- |
| Managed containers | Run and deploy containers | Elastic Kubernetes Service, Elastic Container Service | Kubernetes Service, Container Apps | Kubernetes Engine |
| Container registry | Store and manage container images | Elastic Container Registry | Container Registry | Artifact Registry |

Management & Security, Identity

| Service type | Description | AWS | Azure | GCP |
| --- | --- | --- | --- | --- |
| Access management | User permissions and authentication | Identity and Access Management (IAM) | Entra ID | Cloud Identity |
| Activity tracking | Track user activity | CloudTrail | Monitor Activity Log | Access Transparency and Access Approval |
| Security | Protect your data, network, and applications | Security Hub | Security | Security Command Center |
| Monitoring | Monitor network traffic and detect anomalies | CloudWatch, Transit Gateway Network Manager | Monitor, Anomaly Detector | Operations, Network Intelligence Center |
| Automation | Perform processes automatically | OpsWorks | Automation | Compute Engine Management |
| Cost optimization | Reduce your cloud spend | Cost Optimization | Cost Management | Recommender |
