
A Comprehensive Guide to Using Azure Spot Instances for Cost Reduction

Explore how Azure Spot Virtual Machines work, how they can significantly reduce your cloud costs, and discover strategies for their practical implementation.
Jul 12, 2024  · 10 min read

Cloud computing is useful for Data Science and AI applications. There are many cloud providers, and choosing the right one is crucial for our application. However, a big challenge in cloud computing is managing costs. When we deploy applications and workloads to the cloud, we need to be aware of the costs, as the machines operating in the cloud can be very expensive. 

In this article, we will explore Azure Spot Virtual Machines (VMs), also called Azure Spot Instances. These machines allow us to save costs when deploying to Microsoft Azure. If you are new to Microsoft Azure and need an introduction, refer to our Introduction to Azure course. For more advanced users, our Azure Architecture and Services course would be useful. Finally, if you are looking for a quick reference, our simple sheet comparing the different cloud providers can be found here: AWS, Azure and GCP Service Comparison for Data Science and AI.  

Let’s dive into Azure Spot VMs.

What are Azure Spot VMs?

Azure Spot VMs run on spare compute capacity that is sitting unused in Azure's data centers at a given time. Because this capacity is not currently in use, Azure offers it at significantly reduced rates, with discounts of up to 90% compared to the standard price of Azure VMs.

This cost reduction, however, comes with uncertainty. If demand for VMs from other users increases, Azure can reclaim these Spot VMs at any time, with as little as 30 seconds of notice. This makes the availability of Spot VMs unpredictable. As a result, we should use Spot VMs only for tasks that don't require constant availability and that can afford to be interrupted. Such tasks include:

Batch jobs

Batch jobs are tasks that we can run on a computer without manual intervention. These tasks are typically processed in large groups or “batches”, either sequentially or in parallel. These jobs often have flexible start and end times, and therefore, they can be paused and resumed without significant disruption. 

For example, consider a financial institution that needs to process transactions at the end of each business day. They can set up a batch job to automatically load all the thousands or millions of transactions from various sources, validate them, and then update account balances accordingly. 

Test and development environments

Test and development environments are environments where software developers can build and test applications without affecting the live production environment. Setting up these environments can be costly, and this is where Spot VMs can be used. 

For example, imagine a team working on an e-commerce website. They want to try a new feature that recommends products based on user browsing history. They can develop this feature in their development environment and then move it to a test environment to rigorously test for bugs, performance issues, and user experience. Both environments can run on Spot VMs, because an interrupted build or test run can simply be restarted. After successful testing, they can deploy the feature to the production environment, making it available to customers.

Fault-tolerant applications

Fault-tolerant applications are applications that can continue operating without interruption even when certain components fail. This can be achieved with built-in redundancy and failover mechanisms. Such applications are well-suited for Spot VMs. An example of such an application might have several servers that are handling user requests. If one server fails, the system automatically reroutes requests to another server. Because such operations are fault-tolerant, they can be used on Spot VMs to save costs.

Benefits of Azure Spot VMs

The main benefit of using Azure Spot VMs is the substantial cost savings. As we mentioned, the discount can be up to 90% compared to on-demand VM prices. Because of this discount, we can run large-scale operations that would otherwise be very expensive with on-demand VMs, scaling workloads up or down based on the available capacity. Furthermore, with Spot VMs, we can set the maximum price we are willing to pay for the VM. This gives us control over our spending.
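To make the savings concrete, here is a minimal sketch of the arithmetic. The hourly rate, discount, and job length are hypothetical figures chosen for illustration, not actual Azure prices, and `spot_savings` is a helper name invented for this example.

```python
def spot_savings(on_demand_hourly: float, discount: float, hours: float) -> dict:
    """Compare the on-demand and Spot cost of a job of the given length.

    discount is a fraction between 0 and 1 (0.9 means a 90% discount).
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    on_demand_cost = on_demand_hourly * hours
    spot_cost = on_demand_cost * (1.0 - discount)
    return {
        "on_demand": round(on_demand_cost, 2),
        "spot": round(spot_cost, 2),
        "saved": round(on_demand_cost - spot_cost, 2),
    }


# Hypothetical figures: a $0.50/hour VM, a 70% Spot discount, a 200-hour job.
print(spot_savings(0.50, 0.70, 200))
```

The actual discount varies by region, VM size, and time, so the `discount` argument should come from the current Spot price rather than from a fixed assumption.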

We can also select an eviction policy to specify what happens when Azure reclaims the VM: we can either deallocate or delete it. If we choose to deallocate, the VM is stopped but not deleted. This way, we can restart the VM when capacity becomes available again. In the meantime, we pay only for the storage costs. If we choose to delete, the VM is removed along with the data and workloads stored on it.

How to Deploy a Spot VM

In this section, we explain how to deploy a Spot VM and select the options associated with it, such as pricing and eviction policy. We can deploy either a single Spot VM or a group of Spot VMs, known as a Spot Virtual Machine Scale Set.

Single Spot VMs 

The easiest way is to create a VM from the Azure portal. 

  1. Select Virtual Machines under Services in the portal and then select Create.
  2. On the Create a virtual machine page, fill out all the details and select the check box Run with Azure Spot discount.
  3. We then need to select the Eviction type and the Eviction policy.
  4. For the Eviction type, we can select either Capacity only or Price or capacity. With Capacity only, Azure evicts our VM only when it needs the capacity back, and the maximum price is capped at the standard pay-as-you-go rate. To set a lower maximum price instead, we can select Price or capacity and enter the maximum price we are willing to pay. With this option, Azure evicts the VM either when it needs the capacity back or when the Spot price exceeds the specified maximum price.
  5. The Eviction policy determines what happens when Azure evicts our machine. We can select either Deallocate or Delete. Selecting Deallocate stops the VM but does not delete the data or applications on it. We won't be charged for the VM while it is stopped; however, we will be charged for any associated storage. This option is suitable if we want to restart the VM when capacity is available again. If we select Delete instead, the VM and its underlying disks are deleted when Azure reclaims the instance.
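The difference between the two eviction types in the steps above can be sketched as a small decision function. This is a simplified model for illustration only, not Azure's actual eviction logic. The function name and parameters are hypothetical, and `max_price=None` is our own convention for the Capacity only option.

```python
from typing import Optional


def is_evicted(capacity_needed: bool,
               current_price: float,
               max_price: Optional[float] = None) -> bool:
    """Simplified model of the two eviction types.

    max_price=None stands for "Capacity only".
    A numeric max_price stands for "Price or capacity".
    """
    # Both eviction types evict the VM when Azure needs the capacity back.
    if capacity_needed:
        return True
    # "Capacity only": the price alone never triggers an eviction.
    if max_price is None:
        return False
    # "Price or capacity": evict once the Spot price exceeds our maximum.
    return current_price > max_price
```

For example, `is_evicted(False, 0.20, max_price=0.15)` returns `True` because the price exceeds the maximum, while the same call without `max_price` returns `False`.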

Creating a virtual machine in Microsoft Azure. Source: Image by Author

We can also use the Azure CLI or Azure PowerShell, among other methods, to start a Spot VM. For more details, refer to the Azure documentation on Spot VMs.
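As a rough sketch of the Azure CLI route, the helper below assembles an `az vm create` command with the Spot-related flags `--priority`, `--max-price`, and `--eviction-policy`. The resource group, VM name, and image are placeholder values, and `build_spot_vm_command` is a hypothetical helper written for this example. The script only prints the command and does not run it.

```python
import shlex
from typing import List


def build_spot_vm_command(resource_group: str,
                          name: str,
                          image: str,
                          max_price: float = -1,
                          eviction_policy: str = "Deallocate") -> List[str]:
    """Assemble an `az vm create` command for a Spot VM.

    A max_price of -1 caps the price at the pay-as-you-go rate, so the VM
    is not evicted for price reasons (the "Capacity only" behavior).
    """
    if eviction_policy not in ("Deallocate", "Delete"):
        raise ValueError("eviction_policy must be 'Deallocate' or 'Delete'")
    return [
        "az", "vm", "create",
        "--resource-group", resource_group,
        "--name", name,
        "--image", image,
        "--priority", "Spot",
        "--max-price", str(max_price),
        "--eviction-policy", eviction_policy,
    ]


# Placeholder names; replace them with your own resource group, VM name, and image.
command = build_spot_vm_command("my-resource-group", "my-spot-vm", "Ubuntu2204")
print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it can be passed to `subprocess.run` without shell quoting issues once we are ready to execute it.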

Spot Virtual Machine Scale Sets (VMSS) 

We can also create groups of Spot VMs known as Spot Virtual Machine Scale Sets. To do this, search for Virtual Machine Scale Sets on the Azure portal and create one. VMSS allows us to automate the deployment and management of multiple Spot VMs, making it easier to scale in and out as needed. We can define criteria for scaling, such as CPU usage or network traffic, and set a desired capacity based on our maximum price and availability requirements.
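To illustrate what a scaling criterion such as CPU usage amounts to, here is a minimal sketch of a threshold-based rule. It is a toy model of the idea, not the autoscale engine that Azure runs. The thresholds, limits, and function name are assumptions chosen for the example.

```python
def desired_instance_count(current: int,
                           cpu_percent: float,
                           minimum: int = 1,
                           maximum: int = 10,
                           scale_out_above: float = 70.0,
                           scale_in_below: float = 30.0) -> int:
    """Return the instance count a simple CPU-based rule would ask for."""
    if cpu_percent > scale_out_above:
        target = current + 1   # busy: add one instance
    elif cpu_percent < scale_in_below:
        target = current - 1   # idle: remove one instance
    else:
        target = current       # within the comfortable band: no change
    # Never leave the configured bounds of the scale set.
    return max(minimum, min(maximum, target))
```

With Spot instances, the count the rule asks for is a target rather than a guarantee, since evictions or a lack of spare capacity can leave the scale set below it.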

Best Practices for Azure Spot VMs

We recommend these best practices when using Spot VMs to ensure that our workloads remain resilient and cost-effective.

Check capacity

Before deploying Spot VMs, we should check the available capacity in the desired region and for the VM sizes we plan to use. Azure's capacity fluctuates based on demand from other customers, so verifying availability can help us make informed decisions about deploying our workloads.

Set maximum price

We should set the maximum price we are willing to pay per hour for a VM. This helps control costs.

Design for interruptions

Given that Spot VMs can be evicted at short notice, we recommend designing workloads to be resilient to interruptions. Strategies such as checkpointing save the work periodically, so that a job can resume from the last checkpoint rather than start over. For distributed processing workloads, ensure that tasks can be redistributed among the remaining instances to maintain processing throughput even if some instances are evicted.
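The checkpointing strategy described above can be sketched in a few lines. This is a minimal, single-process illustration that stores progress in a local JSON file. The function names, file format, and checkpoint interval are assumptions for the example. On a real Spot VM, the checkpoint should live on storage that survives the eviction.

```python
import json
import os


def load_checkpoint(path: str) -> int:
    """Return the index of the next item to process (0 if no checkpoint exists)."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["next_index"]


def save_checkpoint(path: str, next_index: int) -> None:
    """Write the checkpoint to a temporary file, then swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"next_index": next_index}, f)
    # os.replace swaps the file in one step, so a crash mid-write
    # never leaves a half-written checkpoint behind.
    os.replace(tmp_path, path)


def process_items(items, path, handle, checkpoint_every=100):
    """Process items in order, resuming from the last saved checkpoint."""
    start = load_checkpoint(path)
    for i in range(start, len(items)):
        handle(items[i])
        if (i + 1) % checkpoint_every == 0:
            save_checkpoint(path, i + 1)
    save_checkpoint(path, len(items))
```

If the VM is evicted between two checkpoints, the items processed since the last checkpoint are handled again after the restart, so `handle` should be safe to repeat for the same item.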

Set up monitoring

Effective monitoring is also crucial for managing Spot VMs efficiently. Azure provides tools to track the usage, costs, and eviction events of Spot VMs. By monitoring these metrics, we can gain insights into the performance of our Spot VMs, which allows us to adjust our strategy as needed.
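One way to observe eviction events from inside a VM is the Scheduled Events endpoint of the Azure Instance Metadata Service, which reports an upcoming Spot eviction as an event of type `Preempt`. The sketch below is a minimal illustration of that idea. The helper names are our own, the API version is one we assume to be available, and the endpoint is reachable only from within an Azure VM.

```python
import json
import urllib.request

# Instance Metadata Service address, reachable only from inside an Azure VM.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)


def fetch_scheduled_events(timeout: float = 2.0) -> dict:
    """Fetch the current scheduled-events document for this VM."""
    request = urllib.request.Request(
        SCHEDULED_EVENTS_URL, headers={"Metadata": "true"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def preempt_events(document: dict) -> list:
    """Return only the events that announce a Spot eviction."""
    return [
        event for event in document.get("Events", [])
        if event.get("EventType") == "Preempt"
    ]
```

On a Spot VM, a small loop could call `preempt_events(fetch_scheduled_events())` every few seconds and trigger a final checkpoint as soon as the returned list is not empty.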

Azure Spot VMs vs. Reserved Instances

Depending on the application, a Reserved Instance can sometimes be a better fit than a Spot VM. Reserved Instances are another pricing option that allows us to optimize our cloud costs. This section explains the differences between the two options.

Spot VMs

As we have already seen, Spot VMs provide significant cost savings, sometimes up to 90% compared to on-demand pricing. However, these instances can be reclaimed by Azure with little notice, depending on the demand for capacity. As a result, we should use Spot VMs for workloads that are flexible in terms of timing and that can handle interruptions. This includes batch processing jobs, development and testing environments, and stateless applications that can be easily distributed across multiple instances.

Reserved instances

Reserved Instances, on the other hand, are purchased for either a one-year or three-year term. We can select a specific VM type with a significant discount over on-demand pricing in exchange for the commitment of using it for a year or three years. The main benefit of using Reserved Instances is that there is no risk of eviction from Azure. If the workload has predictable usage patterns and requires constant availability, Reserved Instances are likely the better option.

In summary, the choice between Azure Spot VMs and Reserved Instances comes down to the specific needs regarding cost, flexibility, and reliability.
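A rough way to compare the two options is to estimate the yearly cost of each for a given usage pattern. The sketch below assumes that a reservation is paid for every hour of the year whether or not the VM is used, and that Spot interruptions force some fraction of the work to be redone. All rates and discounts in the usage note are hypothetical, and `yearly_costs` is a helper name invented for this example.

```python
HOURS_PER_YEAR = 8760


def yearly_costs(on_demand_hourly: float,
                 hours_used: float,
                 spot_discount: float,
                 reserved_discount: float,
                 rework_fraction: float = 0.0) -> dict:
    """Estimate the yearly cost of a workload on Spot and on a reservation.

    rework_fraction is the extra share of compute hours spent redoing
    work that was lost to evictions (0.1 means 10% more hours).
    """
    # Spot: pay only for the hours actually run, including redone work.
    spot_hours = hours_used * (1.0 + rework_fraction)
    spot_cost = spot_hours * on_demand_hourly * (1.0 - spot_discount)
    # Reserved: assumed to be billed for the whole year regardless of usage.
    reserved_cost = HOURS_PER_YEAR * on_demand_hourly * (1.0 - reserved_discount)
    return {"spot": round(spot_cost, 2), "reserved": round(reserved_cost, 2)}
```

For a job that runs only 1,000 hours a year, `yearly_costs(1.0, 1000, 0.7, 0.4, rework_fraction=0.1)` shows Spot as far cheaper, while a VM that must run all year without interruption is a natural fit for a reservation even if the Spot discount is deeper.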

Conclusion

Azure Spot VMs are a great solution for achieving cost-effective cloud computing. They can provide us with substantial discounts, sometimes up to 90% compared to on-demand pricing. 

Azure Spot VMs are particularly useful for workloads that are flexible and can tolerate interruptions. These include batch processing jobs, development and testing environments, or any application that doesn't require constant uptime. For these workloads, Spot VMs can help us reduce costs significantly.

To learn more about Azure Spot VMs, check out these resources:

  • Azure Documentation: The official Azure documentation is an excellent starting point, offering detailed guides, best practices, and technical references on using Spot VMs.
  • Community Forums and Support: The Azure community forums and support channels are valuable resources for getting answers to specific questions and learning from the experiences of other Azure users.

If you enjoyed this article, check out these other posts: Azure DevOps Interview Questions and Using GPT on Azure.


Author
Anirudh Kulkarni

I am a technical writer at MathWorks. My work involves editing and writing documentation related to MathWorks products on Amazon Web Services, Microsoft Azure, Docker, and NVIDIA.
