Azure Storage Accounts: Step-by-Step Tutorial for Beginners
Azure Storage forms the backbone of Microsoft’s Azure cloud platform, enabling users to store huge amounts of data securely, reliably, and cost-effectively. It supports various formats, including files, disks, messages, and structured and unstructured data.
Additionally, Azure Storage is highly scalable and integrates with other Azure services, such as the Azure Machine Learning platform.
The purpose of this tutorial is to provide a guide to creating and managing a Storage Account in Azure. By the end of this tutorial, you will be equipped with the knowledge to set up and configure an Azure Storage Account, manage your data effectively, and understand best practices for maintaining and securing Storage Accounts.
Prerequisites
Before diving into the creation and management of an Azure Storage Account, there are a couple of prerequisites to ensure a smooth experience:
- Azure account: To follow along with this tutorial, you must have an active Azure subscription. By default, each subscription can have up to 250 storage accounts per region.
- Azure Portal familiarity: Having a basic understanding of navigating the Azure Portal will make it easier to follow along.
If you need help setting up your account and want to learn the basics of Azure Portal, check out this guide to setting up and configuring Azure.
Step-by-Step Guide to Creating a Storage Account
To create a Storage Account, you can use the Azure Portal if you prefer a graphical user interface, or opt for PowerShell or the Azure CLI if you prefer scripting.
Using the Azure Portal
1. From the Azure Portal, click “Create a resource.”
2. Go to “Storage,” then select “Storage account.” Alternatively, you can type “storage account” in the search box.
3. Under the “Basics” tab, you’ll find the required settings for the storage account:
Setting up the “Basics” settings for a storage account. Image by author
- Subscription: Choose the Azure subscription you want for this storage account.
- Resource group: Select an existing resource group or create a new one by clicking "Create new" and entering a name.
- Storage Account name: Enter a name for your Storage Account that is unique across all of Azure. The name must be 3 to 24 characters long and can contain only lowercase letters and numbers. Many teams add a prefix or suffix, such as a number or an abbreviation for the project, department, purpose, or environment.
- Region: Choose the Azure region to host your storage account. Typically, this is the region closest to your location.
- Performance: Select the performance tier (Standard or Premium). Standard is typically sufficient for most use cases.
- Redundancy: Choose the redundancy option that meets your data replication needs. For non-critical data, locally redundant storage (LRS) is sufficient.
4. The following settings are optional but useful for customizing your Storage Account further:
- Networking: Configure network access, set any required network rules or restrictions, and choose public or private endpoints depending on your security requirements.
- Data protection: Set up data protection options like point-in-time restore for containers, which lets you restore your data to its state at a specific point in time. Another notable feature is soft delete. With soft delete enabled, deleted data is kept for a set retention period, making it possible to restore accidentally or maliciously deleted data.
- Tags: Add tags to organize your Storage Account by applying key-value pairs, like “project: AI chatbot.” This is useful for resource management and billing.
5. Review the configurations and click “Review+Create.” After validation, click “Create” to finalize the storage account's creation. If validation fails, you’ll need to review the settings and correct any incorrect or missing required settings.
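The naming rule for the Storage Account name (3 to 24 characters, lowercase letters and numbers only) is easy to check before you submit the form. The sketch below is a hypothetical helper, not part of any Azure tool, and it only checks the format, not whether the name is already taken.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks only the format rule described above
# (3 to 24 characters, lowercase letters and digits).
is_valid_storage_name() {
  local LC_ALL=C  # make [a-z] match ASCII lowercase only
  [[ "$1" =~ ^[a-z0-9]{3,24}$ ]]
}

# Try a few example names (all made up for illustration).
for name in datastorage proj01devstorage My-Storage ab; do
  if is_valid_storage_name "$name"; then
    echo "$name: ok"
  else
    echo "$name: invalid"
  fi
done
```

To check whether a correctly formatted name is still available, the Azure CLI offers `az storage account check-name --name <name>`.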
Using PowerShell
To create a Storage Account using PowerShell, follow these steps:
1. Open the PowerShell console and log in to your Azure account using:
Connect-AzAccount
2. Create a new resource group if you don't already have one:
New-AzResourceGroup -Name "ResourceGroup001" -Location "EastUS"
3. Create the storage account:
New-AzStorageAccount -ResourceGroupName "ResourceGroup001" -Name "datastorage" -Location "EastUS" -SkuName "Standard_LRS" -Kind "StorageV2"
This command creates a storage account named “datastorage” in the “ResourceGroup001” resource group and the “EastUS” Azure region.
The -SkuName parameter combines the performance tier (Standard or Premium) and the redundancy option (in this case, “LRS,” or locally redundant storage). You can include additional settings in the same way; for example, you can add tags with the -Tag parameter.
You can look up additional commands for storage accounts on the Microsoft website.
4. Verify that the Storage Account was created by looking up your new account:
Get-AzStorageAccount -ResourceGroupName "ResourceGroup001" -Name "datastorage"
Using the Azure CLI
To create a Storage Account using the Azure CLI, follow these steps:
1. Open the CLI console and sign in to your Azure account:
az login
2. Create a resource group (if you don't have one already):
az group create --name ResourceGroup001 --location EastUS
3. Create the Storage Account:
az storage account create --name mystorageaccount --resource-group ResourceGroup001 --location eastus --sku Standard_LRS --kind StorageV2
You can also add further options for your storage account, such as --tags.
Check out the DataCamp Azure CLI Cheat Sheet to learn more about using Azure CLI.
4. Check that the Storage Account was created successfully:
az storage account show --name mystorageaccount --resource-group ResourceGroup001
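As an illustration of the --tags option, here is a minimal sketch that assembles the create command with two tags. The account name, resource group, and tag values are placeholders, and the script only prints the command (a dry run) so it is safe to try without touching your subscription.

```shell
#!/usr/bin/env bash
# Dry run: build the command as an array and print it instead of executing it.
# All names and tag values below are placeholders.
create_cmd=(
  az storage account create
  --name mystorageaccount
  --resource-group ResourceGroup001
  --location eastus
  --sku Standard_LRS
  --kind StorageV2
  --tags project=ai-chatbot environment=dev
)

echo "${create_cmd[*]}"

# To actually create the account, run the array as a command:
# "${create_cmd[@]}"
```

Tags are passed as space-separated key=value pairs after a single --tags flag.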
Storage Account Advanced Configuration Options
For most storage scenarios, the “General purpose v2 storage” account type is the recommended storage account type. This type provides the latest features for Azure storage, is most cost-effective, and supports all storage types, such as blobs, files, and queues.
In this section, we’ll explore these storage types and their uses in more depth and discuss how to manage costs.
Data types
Each storage service is tailored to specific storage needs. Understanding the differences between these services and their use cases will help you make informed decisions when setting up and managing your Azure storage solutions.
Here’s an overview of the storage services available in a storage account:
| Storage type | Overview | Use cases |
| --- | --- | --- |
| Blob storage | Blob storage is specialized for storing unstructured data as blobs (binary large objects) such as documents, videos, images, and backups. | Suitable for applications that require efficient, scalable storage for large amounts of unstructured data, including content delivery, data archiving, and big data analytics. |
| File storage | File storage provides fully managed file shares in the cloud via the industry-standard SMB protocol. | Best for scenarios where applications need shared storage accessible from multiple virtual machines, on-premises deployments, and Azure services. |
| Queue storage | Queue storage is designed to store large numbers of messages that can be accessed from anywhere via authenticated calls using HTTP or HTTPS. | Useful for decoupling application components, where one component generates requests and another processes them asynchronously. Commonly used in messaging and task processing scenarios. |
| Table storage | Table storage provides a NoSQL key-value store for rapid development and fast access to large amounts of structured, non-relational data. | Ideal for applications requiring a schema-less design, such as web applications, user data storage, and metadata storage for structured and semi-structured data. |
Pricing
Azure Storage pricing depends heavily on the storage tier, data redundancy options, and access patterns.
By understanding these factors and choosing the appropriate options for your data, you can optimize costs while ensuring that your storage solution meets your performance and accessibility needs.
We’ll first discuss the most critical factors that affect pricing for storage accounts in Azure. Then, we’ll dive deeper into two important options affecting performance and pricing: the access tiers and the data redundancy options.
Factors affecting costs
Azure Storage pricing is influenced by several factors, including:
- Storage capacity: The amount of data stored in your storage account. Larger storage amounts generally incur higher costs.
- Storage tiers: The tier chosen (hot, cool, cold, or archive) significantly affects the pricing. Each tier is optimized for different usage patterns and costs.
- Transactions and data retrieval: The number of operations performed on the data, such as read, write, and delete. Higher transaction volumes can lead to increased costs.
- Data redundancy options: The replication strategy chosen (LRS, ZRS, GRS, GZRS) impacts the cost. Higher redundancy levels provide greater data durability and availability but at a higher cost.
- Outbound data transfers: Data transferred out of Azure regions incurs additional costs. This is especially relevant for applications with significant data egress to external systems or users.
Comparison between storage access tiers
Azure Storage offers four access tiers: hot, cool, cold, and archive. Choosing the proper access tier for your access patterns helps optimize costs.
Here’s an overview of the different tiers, cost and use cases:
| Tier | Overview | Cost | Use cases |
| --- | --- | --- | --- |
| Archive tier | Optimized for data that is rarely accessed and stored for at least 180 days | Lowest storage costs but highest retrieval costs and latency | Best for data that is rarely needed, such as long-term archival storage, compliance, and historical data retention |
| Cold tier | Optimized for data that is infrequently accessed and stored for at least 90 days | Higher storage costs than the archive tier but lower retrieval costs and latency | Best for older data that isn’t used frequently but is expected to be available for immediate access |
| Cool tier | Optimized for data that is infrequently accessed and stored for at least 30 days | Higher storage costs compared to the cold tier but lower access and transaction costs | Suitable for data that is not accessed regularly but needs to be available when needed, such as backups, disaster recovery data, and long-term business data storage |
| Hot tier | Optimized for data that is accessed frequently | Highest storage costs but lowest access and transaction costs | Ideal for data that requires fast, frequent access, such as active datasets, user files, and live applications |
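The minimum storage durations in the table (30, 90, and 180 days) matter when planning how long data will sit in a tier. The following sketch encodes them in two hypothetical helper functions; the function names are made up for illustration, and the hot tier is assumed to have no minimum.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the minimum storage durations per tier.

# Print the minimum number of days data is expected to stay in a tier.
min_storage_days() {
  case "$1" in
    hot)     echo 0 ;;    # assumption: no minimum for the hot tier
    cool)    echo 30 ;;
    cold)    echo 90 ;;
    archive) echo 180 ;;
    *)       echo "unknown tier: $1" >&2; return 1 ;;
  esac
}

# Succeed if data kept for <days> satisfies the tier's minimum duration.
meets_min_duration() {
  local tier="$1" days="$2" min
  min=$(min_storage_days "$tier") || return 1
  (( days >= min ))
}

if meets_min_duration cold 45; then
  echo "45 days in cold: meets the minimum"
else
  echo "45 days in cold: below the $(min_storage_days cold)-day minimum"
fi
```

Data that is moved or deleted before the minimum duration typically incurs an early deletion charge, so a check like this can help when deciding which tier fits a dataset.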
Data redundancy options
Data redundancy in Azure refers to replicating data across different locations to ensure its durability, availability, and accessibility, even during hardware failures, network issues, or disasters.
Each redundancy option offers different levels of protection, availability, and cost. These options ensure your data is safe and can be recovered or accessed from alternative locations when necessary.
The following table sorts the different options from low cost to high cost:
| Redundancy option | Overview | Use cases |
| --- | --- | --- |
| Locally redundant storage (LRS) | LRS replicates your data three times within a single data center in a region | Suitable for scenarios where data loss can be tolerated, but high availability within a single region is needed |
| Zone-redundant storage (ZRS) | ZRS replicates your data synchronously across three storage clusters in a single region, each located in a different availability zone | Suitable for scenarios requiring high availability and durability within a region, protecting against data center-level failures |
| Geo-redundant storage (GRS) | GRS replicates your data to a secondary region hundreds of miles from the primary location. It combines LRS in the primary region and asynchronous replication to the secondary region | Suitable for disaster recovery scenarios where data needs to be protected against regional outages |
| Geo-zone-redundant storage (GZRS) | GZRS combines the benefits of ZRS and GRS. It synchronously replicates data across three Azure availability zones in the primary region and asynchronously replicates it to a secondary geographic region | Suitable for mission-critical applications requiring high availability, durability, and disaster recovery capabilities |
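When scripting, these redundancy options map onto the SKU value used earlier (for example, Standard_LRS). The sketch below shows that mapping for the Standard performance tier only; the helper name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a redundancy abbreviation from the table above
# to the corresponding Standard-tier SKU name.
standard_sku() {
  case "$1" in
    LRS|ZRS|GRS|GZRS) echo "Standard_$1" ;;
    *) echo "unknown redundancy option: $1" >&2; return 1 ;;
  esac
}

# List the options from lowest to highest cost, as in the table.
for option in LRS ZRS GRS GZRS; do
  echo "$option -> $(standard_sku "$option")"
done
```

The resulting value is what you would pass to --sku in the Azure CLI or -SkuName in PowerShell.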
If you’re interested in more in-depth knowledge about Azure, check out the Azure Architecture and Services course.
Lifecycle management
In addition to choosing the right options for your storage account, lifecycle management in Azure storage allows you to automate data movement between different access tiers to optimize costs.
Here's how to set up policies for automated data movement:
The lifecycle management overview screen of a storage account. Image by author
1. Go to your storage account in the Azure portal. Under the “Data management” section, select “Lifecycle management.”
2. Click on “Add rule” to create a new lifecycle management policy. Provide a name for the rule and specify the conditions under which the rule should apply.
3. Define conditions and actions:
- Set conditions based on the age of the data or the last accessed date.
- Define actions such as moving data to a cooler tier or deleting it after a specified period.
- For example, you can set a rule to move blobs to the cool tier if they haven’t been modified for 30 days, to the cold tier after 90 days, and then to the archive tier if they haven’t been accessed for 180 days.
4. Review the rule settings and save the policy. Based on the defined conditions, the policy will automatically apply to the blobs in the storage account.
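The same kind of rule can also be expressed as a JSON policy and applied from the command line. The sketch below writes a policy resembling the example rule above to a file. It assumes all three thresholds are based on days since last modification (a last-access condition would additionally require access tracking to be enabled), and the rule name, file name, and account names are placeholders.

```shell
#!/usr/bin/env bash
# Write a lifecycle policy resembling the example rule above.
# Assumption: every threshold counts days since the blob was last modified.
policy_file="lifecycle-policy.json"

cat > "$policy_file" <<'EOF'
{
  "rules": [
    {
      "enabled": true,
      "name": "tier-down-by-age",
      "type": "Lifecycle",
      "definition": {
        "filters": { "blobTypes": ["blockBlob"] },
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToCold": { "daysAfterModificationGreaterThan": 90 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 180 }
          }
        }
      }
    }
  ]
}
EOF

echo "Wrote $policy_file"

# To apply the policy (placeholder names), run:
# az storage account management-policy create \
#   --account-name mystorageaccount \
#   --resource-group ResourceGroup001 \
#   --policy @lifecycle-policy.json
```

Keeping the policy in a file makes it easy to review and version alongside the rest of your infrastructure scripts.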
Lifecycle management example scenarios
Here are some common scenarios data teams may encounter and possible lifecycle management strategies for each.
1. Compliance and retention:
- Scenario: Regulatory requirements mandate that specific data must be retained for several years but is rarely accessed.
- Strategy: Store data in the cold tier for the first year and then move it to the archive tier for long-term retention. This approach ensures compliance while minimizing storage costs.
2. Application data:
- Scenario: An application generates logs that are important for the first month but are only sometimes accessed after that. However, they are still expected to be available quickly.
- Strategy: Set a policy to move logs from the hot tier to the cool tier after 30 days. This lowers storage costs while keeping the logs available for immediate access.
3. Development and testing data:
- Scenario: Development teams generate large volumes of data during software development and testing phases, which are heavily accessed during the development cycle but become less relevant over time.
- Strategy: Store development and test data in the hot tier during active development. When active development slows down, move data to the cold tier after 90 days and then to the archive tier after 180 days, where it can be retrieved for audit or reference.
Conclusion
Azure Storage accounts offer scalable, secure, and flexible data storage options. In this tutorial, we discussed how to create, manage, and optimize storage accounts using Azure Portal, PowerShell, and Azure CLI. We also covered advanced configuration options, pricing, and cost-saving strategies.
If you’d like to learn more about the fundamental capabilities of Azure, check out DataCamp’s Azure Fundamentals track.
FAQs
Can I use Terraform to create an Azure Storage Account, and how does it help with infrastructure management?
You can use Terraform to create and manage Azure Storage Accounts as part of your infrastructure-as-code strategy. Terraform allows you to define your cloud resources in a declarative configuration file, which makes it easier to automate and manage infrastructure at scale. By using Terraform, you can ensure that your Azure Storage Accounts and other resources are consistently deployed and configured according to your specifications, and it also enables version control and collaboration across teams.
How do I obtain the access keys for an Azure Storage Account, and why are they important?
Access keys are crucial for authenticating and accessing your Azure Storage Account data. You can obtain these keys through the Azure Portal, where you can navigate to your storage account and find the keys under the "Access keys" section. These keys provide programmatic access to your storage account, so it's important to handle them securely and rotate them periodically to maintain security. Additionally, you can use shared access signatures (SAS) as a more granular and secure method of granting temporary access to your storage resources.
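Keys can also be listed and rotated from the Azure CLI. The following is a minimal dry-run sketch that only prints the commands; the account and resource group names are placeholders.

```shell
#!/usr/bin/env bash
# Dry run: print the commands instead of executing them.
# Account and resource group names are placeholders.
account="mystorageaccount"
group="ResourceGroup001"

# List both access keys for the account.
list_keys_cmd=(az storage account keys list
  --account-name "$account" --resource-group "$group" --output table)

# Regenerate (rotate) the primary key.
renew_key_cmd=(az storage account keys renew
  --account-name "$account" --resource-group "$group" --key primary)

echo "${list_keys_cmd[*]}"
echo "${renew_key_cmd[*]}"
```

Rotating a key invalidates the old value, so applications that use it need to be updated with the new key.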
Can I change my Azure Storage Account's performance tier or redundancy option after it has been created?
You can change the redundancy option after creating your Azure Storage Account, but not the performance tier. To change redundancy, navigate to your storage account in the Azure Portal and update the redundancy setting. Keep in mind that some changes, such as converting to or from zone-redundant storage, can take time to complete and could incur additional costs. To move between Standard and Premium, create a new storage account with the desired performance tier and copy your data to it.
What are the best practices for securing data stored in an Azure Storage Account?
To secure data in an Azure Storage Account, consider the following best practices:
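- Rely on encryption at rest: Azure Storage Service Encryption (SSE) encrypts stored data by default, and you can use customer-managed keys if you need control over the encryption keys.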
- Use network security measures like private endpoints and virtual networks.
- Implement role-based access control (RBAC) to limit access to your storage account.
- Enable and configure Microsoft Defender for Storage (formerly Azure Defender for Storage) to detect threats.
- Regularly rotate account keys and use shared access signatures (SAS) to control access.
How can I monitor the performance and usage of my Azure Storage Account?
Azure Monitor allows you to monitor the performance and usage of your Azure Storage Account. It provides transaction rates, latency, and capacity utilization metrics. Additionally, you can set up alerts to notify you of unusual activity or when usage thresholds are exceeded. Logs and diagnostic data can also be collected for more detailed analysis.
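Metrics can also be queried from the Azure CLI. The sketch below is a dry run that prints a command for hourly transaction counts; the subscription ID, resource group, and account name are placeholders that you would replace with your own values.

```shell
#!/usr/bin/env bash
# Dry run: print the command instead of executing it.
# Subscription ID, resource group, and account name are placeholders.
subscription="<subscription-id>"
group="ResourceGroup001"
account="mystorageaccount"

# Full resource ID of the storage account.
resource_id="/subscriptions/${subscription}/resourceGroups/${group}/providers/Microsoft.Storage/storageAccounts/${account}"

# Query the Transactions metric at a one-hour granularity.
metrics_cmd=(az monitor metrics list
  --resource "$resource_id" --metric Transactions --interval PT1H)

echo "${metrics_cmd[*]}"
```

Instead of building the resource ID by hand, you can read it with `az storage account show --name <account> --resource-group <group> --query id --output tsv`.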
Anneleen is a data scientist with a background in statistics and social sciences. She currently works as a freelance data scientist in finance and is studying a postgraduate degree in Applied AI. Anneleen is the instructor of four DataCamp courses including 'Azure Management and Governance'.
Anneleen Rummens