AWS EMR - Instance, Auto Scaling and Spot Configurations
When an EMR cluster is created, you have to answer several configuration questions: what type of node configuration do you need, do you want auto scaling, and should you use spot instances? These options are interrelated and can sometimes be confusing. This article is an attempt to explain them.
Node Configuration
This is about choosing what type of instance you need for the master, core and task nodes of the cluster. There are two options here.
Uniform Instance Groups
- Default and simplest option.
- Can have a maximum of 50 instance groups: one master instance group, one core instance group and up to 48 optional task instance groups.
- Each instance group contains nodes of the same instance type, which cannot be changed once the group is created.
- Additional task instance groups can be created with different instance types.
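As a sketch, a uniform-instance-groups layout can be expressed as the `InstanceGroups` list passed to EMR's `RunJobFlow` API (for example via boto3's `run_job_flow`); the group names, instance types and counts below are illustrative, not recommendations:

```python
# Illustrative uniform instance group layout for EMR's RunJobFlow API.
# Names, instance types and counts are placeholders.
instance_groups = [
    {
        "Name": "Master",
        "InstanceRole": "MASTER",
        "Market": "ON_DEMAND",
        "InstanceType": "m5.xlarge",
        "InstanceCount": 1,
    },
    {
        "Name": "Core",
        "InstanceRole": "CORE",
        "Market": "ON_DEMAND",
        "InstanceType": "r5.2xlarge",  # all nodes in a group share one type
        "InstanceCount": 3,
    },
    {
        # An additional task instance group may use a different instance type.
        "Name": "Task-1",
        "InstanceRole": "TASK",
        "Market": "ON_DEMAND",
        "InstanceType": "m5.4xlarge",
        "InstanceCount": 2,
    },
]

# Would be passed as:
# emr.run_job_flow(..., Instances={"InstanceGroups": instance_groups})
```

Note that the type difference lives between groups, never within one: each group is homogeneous.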
Instance Fleets
- Offer the widest variety of provisioning options for node configurations.
- Each node type has a single instance fleet, and the task instance fleet is optional. For each instance fleet you can specify up to five instance types (such as r5.2xlarge or m5.4xlarge).
- For core and task instance fleets, you assign a target capacity, expressed either as a number of EC2 instances or as units such as vCPUs, depending on the option selected.
- When you assign a target capacity to an instance fleet, AWS chooses any mix of the specified instance types to fulfill that capacity.
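A minimal instance-fleet sketch, again as the structure passed to `RunJobFlow` (the fleet names, types and `WeightedCapacity` values are assumptions for illustration; `WeightedCapacity` controls how many target-capacity units each instance counts for, e.g. its vCPUs):

```python
# Illustrative instance fleet configuration: up to five instance types per
# fleet; EMR picks any mix of them to satisfy the target capacity.
instance_fleets = [
    {
        "Name": "MasterFleet",
        "InstanceFleetType": "MASTER",
        "TargetOnDemandCapacity": 1,
        "InstanceTypeConfigs": [{"InstanceType": "m5.xlarge"}],
    },
    {
        "Name": "CoreFleet",
        "InstanceFleetType": "CORE",
        "TargetOnDemandCapacity": 32,  # e.g. 32 vCPU-weighted units
        "InstanceTypeConfigs": [
            # EMR may satisfy 32 units with four r5.2xlarge, two m5.4xlarge,
            # or a mix of both.
            {"InstanceType": "r5.2xlarge", "WeightedCapacity": 8},
            {"InstanceType": "m5.4xlarge", "WeightedCapacity": 16},
        ],
    },
]

# Would be passed as:
# emr.run_job_flow(..., Instances={"InstanceFleets": instance_fleets})
```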
Auto Scaling
- Auto scaling automatically scales core and task nodes out and in based on the scaling policy you set. This helps EMR handle workload spikes caused by high volumes, traffic or other reasons.
- A policy can have scale-out rules that define when to add nodes and scale-in rules that define when to remove nodes. A simple example: add 2 nodes when the YARN memory available percentage falls to 15%, and remove 2 nodes when it reaches 75%. There are many metrics on which nodes can be added and removed.
- Auto scaling is not available for the instance fleet configuration. Only uniform instance groups support it, and each instance group can have its own scaling policy.
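The example policy above could be sketched as an EMR `AutoScalingPolicy` structure; the rule names, min/max capacities and cooldowns below are assumptions, while the metric (`YARNMemoryAvailablePercentage`) and thresholds follow the example:

```python
# Illustrative auto scaling policy: add 2 nodes when YARN memory available
# drops to 15%, remove 2 nodes when it reaches 75%.  Attached per core/task
# instance group (e.g. via the PutAutoScalingPolicy API).
auto_scaling_policy = {
    "Constraints": {"MinCapacity": 2, "MaxCapacity": 10},
    "Rules": [
        {
            "Name": "ScaleOutOnLowYarnMemory",
            "Action": {"SimpleScalingPolicyConfiguration": {
                "AdjustmentType": "CHANGE_IN_CAPACITY",
                "ScalingAdjustment": 2,   # add 2 nodes
                "CoolDown": 300,
            }},
            "Trigger": {"CloudWatchAlarmDefinition": {
                "ComparisonOperator": "LESS_THAN_OR_EQUAL",
                "EvaluationPeriods": 1,
                "MetricName": "YARNMemoryAvailablePercentage",
                "Namespace": "AWS/ElasticMapReduce",
                "Period": 300,
                "Statistic": "AVERAGE",
                "Threshold": 15.0,
                "Unit": "PERCENT",
            }},
        },
        {
            "Name": "ScaleInOnHighYarnMemory",
            "Action": {"SimpleScalingPolicyConfiguration": {
                "AdjustmentType": "CHANGE_IN_CAPACITY",
                "ScalingAdjustment": -2,  # remove 2 nodes
                "CoolDown": 300,
            }},
            "Trigger": {"CloudWatchAlarmDefinition": {
                "ComparisonOperator": "GREATER_THAN_OR_EQUAL",
                "EvaluationPeriods": 1,
                "MetricName": "YARNMemoryAvailablePercentage",
                "Namespace": "AWS/ElasticMapReduce",
                "Period": 300,
                "Statistic": "AVERAGE",
                "Threshold": 75.0,
                "Unit": "PERCENT",
            }},
        },
    ],
}
```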
Spot Instances
- Spot instances are unused EC2 instances available for less than on-demand pricing, sometimes up to 90% less. When you choose spot, you set a maximum price you are willing to pay; if capacity is available you get the instance at the current spot price (not necessarily your maximum), and it can be taken away when the spot price rises above your maximum. EMR can use on-demand instances as well as spot instances.
- Both uniform instance groups and instance fleets can use spot instances.
- With the instance fleet configuration, when specifying the target capacity you can choose how many on-demand and how many spot instances you need; it can be a mix, and AWS fulfills it based on availability.
- With uniform instance groups, when adding an instance group you choose either on-demand or spot for that entire group.
- Non-critical workloads in production, and non-production clusters, can use spot instances. Choosing a spot instance for the master node can cause the whole cluster to terminate when the instance is reclaimed. Running all core nodes on spot is also risky, so a good approach is a mix of on-demand and spot instances for core nodes and all spot instances for task nodes. Since core nodes host HDFS, storing data in HDFS on spot instances risks data loss.
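The recommended mix above could be sketched with instance fleets: on-demand plus spot capacity for core, all spot for task. The bid percentages, timeouts and instance types are illustrative assumptions:

```python
# Illustrative spot setup: core fleet mixes on-demand and spot capacity,
# task fleet runs entirely on spot.
core_fleet = {
    "Name": "CoreFleet",
    "InstanceFleetType": "CORE",
    "TargetOnDemandCapacity": 2,  # keep some core nodes on-demand for HDFS safety
    "TargetSpotCapacity": 2,
    "InstanceTypeConfigs": [
        # Bid as a percentage of on-demand price rather than a fixed amount.
        {"InstanceType": "r5.2xlarge", "BidPriceAsPercentageOfOnDemandPrice": 50.0},
        {"InstanceType": "m5.2xlarge", "BidPriceAsPercentageOfOnDemandPrice": 50.0},
    ],
    "LaunchSpecifications": {"SpotSpecification": {
        # Fall back to on-demand if spot can't be provisioned in 20 minutes.
        "TimeoutDurationMinutes": 20,
        "TimeoutAction": "SWITCH_TO_ON_DEMAND",
    }},
}

task_fleet = {
    "Name": "TaskFleet",
    "InstanceFleetType": "TASK",
    "TargetSpotCapacity": 4,  # task nodes hold no HDFS data, so all-spot is fine
    "InstanceTypeConfigs": [{"InstanceType": "m5.4xlarge"}],
}
```

The `SWITCH_TO_ON_DEMAND` timeout action is what keeps a cluster usable when spot capacity is temporarily unavailable.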
Uniform instance groups are straightforward, and they support both spot instances and auto scaling, an essential feature for critical workloads. Instance fleets do not support auto scaling, but they pair well with spot instances because they allow a wide range of instance types per fleet. Production and critical workloads that need auto scaling should use uniform instance groups, with spot instances if required, while other use cases can go with instance fleets.