Introduction

Running your application in a cloud environment offers numerous advantages, including enhanced security, scalability, and cost-efficiency. To help you further optimize your cloud expenditure, Microsoft Azure offers Spot Instances, also called Spot Virtual Machines (Spot VMs). This article explores the key features of Azure Spot Instances, their pricing details, and factors to consider when running workloads on them.

Understanding Azure Spot Instances

Azure Spot Instances are Azure's counterpart to AWS Spot Instances and GCP Spot VMs. They are a pricing model designed for fault-tolerant, flexible workloads, offered at a discount compared with the standard pay-as-you-go pricing model. Unlike traditional cloud instances, Azure Spot Instances provide access to surplus Azure compute capacity at a discount of up to 90%, enabling businesses to achieve substantial cloud cost savings while maintaining performance and scalability.

Azure Spot Instances are allocated based on available capacity, and no service level agreements (SLAs) are provided for them. The Azure infrastructure may evict Spot Virtual Machines when it needs the capacity elsewhere. They are therefore suitable only for workloads that can handle interruptions, and users must be prepared to deal with evictions.

The availability of capacity for Azure Spot instances can vary based on factors such as instance size, region, and time of day. When Azure needs to reclaim capacity, Azure Spot instances are evicted with a 30-second notice. Applications must be designed to handle these evictions gracefully and to recover afterwards.

Azure spot instance configuration for interruptible workload with Eviction policy
Source: Microsoft Azure

Unlike traditional VMs, Spot VMs have no priority access to compute capacity, and they are not guaranteed to remain available once they have acquired that capacity. They suit applications with minimal or no time constraints, low organizational priority, and short processing times.

Azure Spot instances are ideal for workloads that can tolerate interruptions, such as batch processing jobs, development/testing environments, and large compute workloads.

Key features of Azure Spot VM

Azure Spot Instances are beneficial for organizations looking to optimize their cloud expenditure while maintaining performance and scalability. Some of the key features of Azure Spot VMs are:

Cost Savings: Azure Spot instances offer access to unused Azure compute capacity at significantly discounted rates compared to the regular pay-as-you-go pricing model. Spot VMs offer potential discounts of up to 90%, making them an ideal, cost-optimized solution for running fault-tolerant workloads on Azure cloud infrastructure.

Eviction Policy: Every Azure Spot instance has an eviction policy, which determines what happens to the VM when Azure reclaims its resources. VMs can be evicted because of capacity or because the spot price exceeds the maximum price you set. Users can set the eviction policy to Deallocate or Delete.

The Deallocate policy (the default) moves your VM to the stopped-deallocated state, allowing you to redeploy it later. If you want the VM to be deleted instead, set the eviction policy to Delete.
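As a rough illustration of the two policies, the sketch below builds the Spot-related fragment of a VM definition. The property names `priority` and `evictionPolicy` follow the ARM template schema for virtual machines as best understood here and should be checked against the current Azure documentation; the helper `spot_vm_properties` is a hypothetical name introduced only for this example.

```python
import json

# The two eviction policies described above.
VALID_EVICTION_POLICIES = {"Deallocate", "Delete"}


def spot_vm_properties(eviction_policy: str = "Deallocate") -> dict:
    """Return the Spot-related properties for a VM definition.

    Assumption: the keys mirror the ARM template property names
    ("priority", "evictionPolicy"); verify them before real use.
    """
    if eviction_policy not in VALID_EVICTION_POLICIES:
        allowed = ", ".join(sorted(VALID_EVICTION_POLICIES))
        raise ValueError(f"eviction_policy must be one of: {allowed}")
    return {
        "priority": "Spot",
        "evictionPolicy": eviction_policy,
    }


if __name__ == "__main__":
    # Print the fragment for a VM that should be deleted on eviction.
    print(json.dumps(spot_vm_properties("Delete"), indent=2))
```

Rejecting unknown policy names early keeps a typo from reaching a deployment, where the failure would be harder to trace.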

Graceful Shutdown when Evicted: Regardless of the Eviction policy, you’ll receive a 30-second notification before a Spot VM eviction occurs. This allows you to save your work before eviction and gracefully shut down your application or perform any necessary data storage, backup, or cleanup tasks before the VM is powered off or deleted.

Dynamic Pricing: The dynamic pricing model of Azure Spot Instances adjusts prices based on supply and demand dynamics, allowing businesses to capitalize on cost-saving opportunities during periods of low demand.

Additionally, users can set a maximum price they’re willing to pay for a Spot VM. This helps to manage costs and avoid unexpected charges if Spot VM prices fluctuate.

Broad VM Selection: Spot VMs are available across a wide range of VM types and sizes, catering to diverse workloads requiring different compute resources. Users can choose from over 500 different instance types to provision their resources.

No Service Level Agreements (SLAs): There are no guaranteed uptime or performance metrics associated with Spot VMs. They are best suited for workloads that can tolerate interruptions without significant impact.

Pricing comparison for Azure Spot Instances

The pricing of Azure Spot Instances is mainly based on supply and demand dynamics. Prices may vary based on factors such as region, instance type, and SKU (Stock Keeping Unit). Azure continuously adjusts spot prices in real time based on the availability of surplus capacity. Users can query spot pricing data through the Azure Retail Prices API by filtering for “meterName” and “skuName” values that contain “Spot.”
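The query described above could be assembled roughly as follows. This is a sketch, not a definitive client: it assumes the public endpoint `https://prices.azure.com/api/retail/prices`, the OData-style `$filter` parameter, and the filter fields `armRegionName` and `armSkuName`, all of which should be confirmed against the Retail Prices API documentation. The function names are hypothetical, and the same `contains(...)` clause could be applied to `skuName`.

```python
from urllib.parse import urlencode

# Assumed public endpoint of the Azure Retail Prices API.
RETAIL_PRICES_URL = "https://prices.azure.com/api/retail/prices"


def spot_price_filter(region: str, arm_sku_name: str) -> str:
    """Build a filter that keeps only Spot meters for one size and region."""
    return (
        f"armRegionName eq '{region}' "
        f"and armSkuName eq '{arm_sku_name}' "
        "and contains(meterName, 'Spot')"
    )


def spot_price_url(region: str, arm_sku_name: str) -> str:
    """Return the full, URL-encoded query for the filter above."""
    query = urlencode({"$filter": spot_price_filter(region, arm_sku_name)})
    return f"{RETAIL_PRICES_URL}?{query}"


if __name__ == "__main__":
    # Example values only; substitute your own region and VM size.
    print(spot_price_url("eastus", "Standard_D2as_v4"))
```

The block only builds the URL, so it can be inspected or logged before any request is sent.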

The following table shows the comparison between the regular cost and spot VM cost of a few instance types in Microsoft Azure.

| Instance | vCPU(s)/Core(s) | RAM | Pay-as-you-go pricing | Spot pricing | Savings |
| --- | --- | --- | --- | --- | --- |
| B2pts v2 | 2 | 1 GiB | $7.30/month | $1.82/month | 75% savings |
| A1 v2 | 1 | 2 GiB | $31.39/month | $3.14/month | 90% savings |
| EC96ads v5 | 96 | 672 GiB | $5,185.92/month | $3,630.14/month | 30% savings |
| ND96amsr A100 v4 | 96 | 1,900 GiB | $31,098.73/month | $4,086.37/month | 87% savings |
| D2as v4 | 2 | 8 GiB | $81.76/month | $8.18/month | 90% savings |
Azure Spot Instance Pricing vs Standard Pricing
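The savings column can be reproduced from the two monthly prices with a simple percentage calculation. The sketch below uses the figures from the comparison above; the function name `savings_percent` is introduced only for illustration.

```python
def savings_percent(pay_as_you_go: float, spot: float) -> float:
    """Return the discount of the spot price relative to pay-as-you-go, in percent."""
    if pay_as_you_go <= 0:
        raise ValueError("pay_as_you_go must be positive")
    return (pay_as_you_go - spot) / pay_as_you_go * 100


# Monthly prices in USD, copied from the comparison above:
# instance name -> (pay-as-you-go, spot)
MONTHLY_PRICES = {
    "B2pts v2": (7.30, 1.82),
    "A1 v2": (31.39, 3.14),
    "EC96ads v5": (5185.92, 3630.14),
    "ND96amsr A100 v4": (31098.73, 4086.37),
    "D2as v4": (81.76, 8.18),
}

if __name__ == "__main__":
    for name, (payg, spot) in MONTHLY_PRICES.items():
        print(f"{name}: {savings_percent(payg, spot):.0f}% savings")
```

Rounded to whole percentages, the results agree with the savings listed for each instance.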

With variable pricing, users can set a maximum price in US dollars (USD), with the option to specify up to five decimal places for precision. For example, setting a max price of 0.98765 USD would limit the cost to $0.98765 per hour.

Alternatively, if users set the max price to -1, the VM won’t be evicted based on price. Instead, the price for the VM will be the current spot price or the price for a standard VM, whichever is lower, as long as there is capacity and quota available.
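The max-price rules can be captured in a small model. This is a simplified sketch of the behavior described above, not Azure's billing logic: it assumes a VM with an explicit max price is evicted once the spot price rises above it, and otherwise pays the current spot price. All names are hypothetical.

```python
from typing import Optional

# Sentinel from the text: a max price of -1 disables price-based eviction.
NO_PRICE_EVICTION = -1


def evicted_for_price(max_price: float, spot_price: float) -> bool:
    """Return True if the VM would be evicted because of price alone."""
    if max_price == NO_PRICE_EVICTION:
        return False
    return spot_price > max_price


def hourly_charge(max_price: float, spot_price: float,
                  standard_price: float) -> Optional[float]:
    """Return the hourly price charged, or None if evicted for price.

    Capacity and quota are assumed to be available; capacity-based
    evictions are outside this model.
    """
    if evicted_for_price(max_price, spot_price):
        return None
    if max_price == NO_PRICE_EVICTION:
        # Pay the spot price or the standard price, whichever is lower.
        return min(spot_price, standard_price)
    return spot_price


if __name__ == "__main__":
    print(hourly_charge(-1, spot_price=0.12, standard_price=0.10))
    print(hourly_charge(0.98765, spot_price=1.0, standard_price=1.5))
```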

Factors to Consider while Building Workloads on Azure Spot VMs

Use the most flexible eviction policy

The eviction policy you select for a Spot VM affects how your orchestration handles an eviction. There are two eviction policies: Deallocate and Delete. The Delete policy is more flexible than the Deallocate (stopped/deallocated) policy.

Delete is the recommended eviction policy as it allows the orchestration to deploy replacement spot VMs to new zones and regions. This helps your workload to find spare compute capacity faster than a stopped/deallocated VM.

With the stopped/deallocated policy, the spot VMs must stay in the same region and zone, and you need to reallocate the VM when compute capacity becomes available. Because you cannot predict when compute capacity will be available, it is recommended to use an automated, scheduled pipeline to attempt a redeployment after an eviction.
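One way to picture such a scheduled redeployment attempt is a retry loop with growing delays. The sketch below is an assumption about how a pipeline step might look, not a prescribed Azure mechanism: `try_start` is a hypothetical callable that attempts to start the deallocated VM and reports success, and the exponential delay schedule is just one reasonable choice.

```python
import time
from typing import Callable


def restart_with_backoff(
    try_start: Callable[[], bool],
    max_attempts: int = 5,
    base_delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry starting a deallocated VM until capacity is available.

    Returns the number of the attempt that succeeded, or 0 if every
    attempt failed. The delay doubles after each failed attempt.
    """
    for attempt in range(1, max_attempts + 1):
        if try_start():
            return attempt
        if attempt < max_attempts:
            sleep(base_delay_s * 2 ** (attempt - 1))
    return 0


if __name__ == "__main__":
    # Simulated start call that succeeds on the third try.
    outcomes = iter([False, False, True])
    print(restart_with_backoff(lambda: next(outcomes), sleep=lambda s: None))
```

Passing `sleep` as a parameter lets the schedule be tested without waiting, and lets a real pipeline substitute its own scheduler.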

Plan for immediate, multiple simultaneous evictions

Spot VMs have no availability guarantees (SLAs) after creation. Incorporating VM health checks into the orchestration helps you prepare for immediate evictions. The health checks need to reside outside the work environment, monitor the status of the spot VM, and start the deployment pipeline to replace the spot VM when the status changes to deallocating or stopping.

While running a cluster of spot instances, it is advisable to design the workload to withstand multiple simultaneous evictions. A simultaneous eviction of multiple VMs could affect the throughput of the application. To avoid this situation, your deployment pipeline should be able to gather signals from multiple VMs and deploy multiple replacement VMs simultaneously.
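A health check that handles several evictions at once might, in outline, collect the status of every VM and compute one replacement batch. The sketch below assumes the statuses arrive as plain strings keyed by VM name; the names and the `target_count` parameter are hypothetical.

```python
from typing import Dict, List, Tuple

# Status values that, per the text above, should trigger a replacement.
EVICTION_STATES = {"deallocating", "stopping"}


def plan_replacements(
    statuses: Dict[str, str], target_count: int
) -> Tuple[List[str], int]:
    """Return the evicted VM names and how many replacements to deploy.

    statuses maps VM name -> current status, gathered from all VMs in
    one pass so simultaneous evictions are handled as a single batch.
    """
    evicted = sorted(
        name for name, status in statuses.items()
        if status.lower() in EVICTION_STATES
    )
    healthy = len(statuses) - len(evicted)
    return evicted, max(0, target_count - healthy)


if __name__ == "__main__":
    cluster = {"vm-a": "running", "vm-b": "deallocating", "vm-c": "stopping"}
    print(plan_replacements(cluster, target_count=3))
```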

Use multiple VM sizes and locations

Building an orchestration that supports multiple VM types and sizes increases flexibility. Azure offers a range of VM types and sizes with similar capabilities at around the same price, which gives you diverse options for replacing evicted VMs.

You can select suitable alternatives based on criteria such as minimum vCPUs/cores, minimum RAM, and maximum price. Each VM type is associated with an eviction rate expressed as a percentage range (0-5%, 5-10%, 10-15%, 15-20%, 20+%), which may vary across regions.
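The selection criteria above can be expressed as a filter followed by a sort. In this sketch the candidate list, size names, and prices are invented placeholders, not real Azure data; only the eviction rate bands come from the text.

```python
from dataclasses import dataclass
from typing import List

# Eviction rate bands from the text, ordered from lowest to highest.
EVICTION_BANDS = ["0-5%", "5-10%", "10-15%", "15-20%", "20+%"]


@dataclass(frozen=True)
class VmOption:
    name: str
    vcpus: int
    ram_gib: float
    hourly_price: float
    eviction_band: str  # one of EVICTION_BANDS


def pick_candidates(options: List[VmOption], min_vcpus: int,
                    min_ram_gib: float, max_hourly_price: float) -> List[VmOption]:
    """Keep sizes that meet the minimums, lowest eviction band first."""
    eligible = [
        o for o in options
        if o.vcpus >= min_vcpus
        and o.ram_gib >= min_ram_gib
        and o.hourly_price <= max_hourly_price
    ]
    return sorted(
        eligible,
        key=lambda o: (EVICTION_BANDS.index(o.eviction_band), o.hourly_price),
    )


# Hypothetical sizes and prices, for illustration only.
SAMPLE_OPTIONS = [
    VmOption("size-a", 2, 8, 0.010, "10-15%"),
    VmOption("size-b", 4, 16, 0.020, "0-5%"),
    VmOption("size-c", 2, 8, 0.012, "0-5%"),
    VmOption("size-d", 1, 2, 0.005, "0-5%"),
    VmOption("size-e", 8, 32, 0.090, "0-5%"),
]

if __name__ == "__main__":
    for option in pick_candidates(SAMPLE_OPTIONS, 2, 8, 0.05):
        print(option.name, option.eviction_band, option.hourly_price)
```

Sorting by eviction band before price favors stability over the last fraction of a cent; reversing the key order would favor price instead.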

Design an idempotent workload

Idempotence is the property whereby an operation or function applied multiple times yields the same result as applying it once. An idempotent workload ensures reliability and consistency in case of an eviction. A forced shutdown can terminate processes before completion, so the same work may be retried on a replacement VM.

By adopting an idempotent design, workloads can effectively handle scenarios where messages are received more than once, ensuring consistent outcomes regardless of interruptions or evictions.
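A minimal sketch of idempotent message handling is shown below. It assumes each message carries a unique ID and keeps completed results in an in-memory dictionary; on a Spot VM that record would need to live in durable external storage so it survives an eviction. The class and the uppercase "work" are stand-ins for illustration.

```python
from typing import Dict


class IdempotentProcessor:
    """Process each message ID at most once, even if it is delivered again."""

    def __init__(self) -> None:
        # message_id -> result. In-memory here; a real workload would use
        # a store that outlives the VM.
        self.completed: Dict[str, str] = {}
        self.work_count = 0  # how many times real work was performed

    def handle(self, message_id: str, payload: str) -> str:
        if message_id in self.completed:
            # Duplicate delivery: return the recorded result, do no work.
            return self.completed[message_id]
        self.work_count += 1
        result = payload.upper()  # stand-in for the real processing step
        self.completed[message_id] = result
        return result


if __name__ == "__main__":
    processor = IdempotentProcessor()
    processor.handle("msg-1", "resize image")
    processor.handle("msg-1", "resize image")  # redelivered after an eviction
    print(processor.work_count)
```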

Use an application warmup period


Many interruptible workloads involve running applications, which require time for installation, booting, connection to external storage, and retrieval of information from checkpoints. It is advisable to incorporate an application warmup period before initiating processing tasks.

Azure spot instances application warm up period

During this warmup period, the application should boot, establish its connections, and prepare to contribute to the workload. Once the health of the application has been validated, it is permitted to commence processing data. This approach ensures a smoother and more reliable operation of the workload, minimizing potential disruptions and errors.
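The warmup gate can be sketched as a worker that refuses to process data until its warmup steps have run and a health check has passed. The class, the step callables, and the health check are all hypothetical placeholders for whatever the real application needs.

```python
from typing import Callable, Iterable, List


class Worker:
    """Worker that only processes data after a successful warmup."""

    def __init__(self, warmup_steps: Iterable[Callable[[], None]],
                 health_check: Callable[[], bool]) -> None:
        self._warmup_steps: List[Callable[[], None]] = list(warmup_steps)
        self._health_check = health_check
        self.ready = False

    def warm_up(self) -> bool:
        """Run every warmup step, then validate health before going live."""
        for step in self._warmup_steps:
            step()
        self.ready = bool(self._health_check())
        return self.ready

    def process(self, item: str) -> str:
        if not self.ready:
            raise RuntimeError("worker has not passed warmup")
        return f"processed {item}"


if __name__ == "__main__":
    log: List[str] = []
    worker = Worker(
        warmup_steps=[lambda: log.append("connect storage"),
                      lambda: log.append("load checkpoint")],
        health_check=lambda: len(log) == 2,
    )
    worker.warm_up()
    print(worker.process("job-1"))
```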

Continuously monitor for eviction

Workloads running on Azure spot instances lack Service Level Agreements (SLAs) and can be evicted unpredictably. Continuous monitoring is required to enhance workload reliability and handle eviction events proactively.

The Scheduled Events service provides timely notification before a VM is evicted: it publishes a Preempt event for the VM at least 30 seconds prior to eviction. By querying the Scheduled Events endpoint as often as once per second, you can detect this signal and orchestrate a graceful shutdown.
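Polling for the Preempt signal could look roughly like the sketch below. It assumes the instance metadata endpoint `http://169.254.169.254/metadata/scheduledevents`, the `Metadata: true` header, the API version shown, and a response with an `Events` list whose entries carry an `EventType` field; these should be verified against the Scheduled Events documentation. The function names are hypothetical, and the fetch call only works from inside an Azure VM.

```python
import json
import time
import urllib.request
from typing import Callable, Dict, List

# Assumed metadata endpoint and API version; verify before use.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)


def fetch_scheduled_events(timeout_s: float = 2.0) -> Dict:
    """Fetch the current Scheduled Events document from inside the VM."""
    request = urllib.request.Request(
        SCHEDULED_EVENTS_URL, headers={"Metadata": "true"}
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)


def preempt_events(document: Dict) -> List[Dict]:
    """Return only the events that announce a Spot eviction."""
    return [
        event for event in document.get("Events", [])
        if event.get("EventType") == "Preempt"
    ]


def wait_for_preempt(
    fetch: Callable[[], Dict] = fetch_scheduled_events,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict]:
    """Poll until a Preempt event appears, then return the matching events."""
    while True:
        events = preempt_events(fetch())
        if events:
            return events
        sleep(interval_s)
```

A caller would typically run `wait_for_preempt()` in a background thread and start checkpointing or draining work as soon as it returns, since only about 30 seconds remain at that point.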

Conclusion

Azure Spot Instances offer a cost-effective way to use Azure's surplus compute capacity, enabling organizations to optimize their cloud expenditure while maintaining performance and scalability. Despite the potential for interruptions due to eviction events, proactive monitoring and orchestration strategies can enhance workload reliability and minimize disruptions. With careful planning and implementation of best practices, Azure Spot Instances can serve as valuable resources for driving efficiency and cost savings in cloud computing environments.

Heera Ravindran

Content Marketer at Economize. An avid writer and a zealous reader who specializes in technical content and has a passion for all things Cloud and FinOps.