Nathan Peck
Senior Developer Advocate for Generative AI at Amazon Web Services
Aug 31, 2017 6 min watch

Amazon ECS: Core Concepts

I just published a video on the official Amazon Web Services YouTube channel, explaining the core concepts of Amazon Elastic Container Service:

 

Script:

Hey all, I’m Nathan Peck, and I’m a developer advocate for EC2 Container Service.

Today I’m going to introduce EC2 Container Service and explain its core concepts and how it can help you to run your architecture in an efficient, automated, and scalable manner. In ECS everything starts with a docker container.

Docker containers are a technology that allows engineers to take their application code, their application runtime engine (Node.js for example), and any dependencies and package them up into a “container image”. This container image can be delivered to any machine that can run Docker, and be run on that machine.

As part of our container services platform AWS provides the EC2 Container Registry, which is a private registry that you can upload your docker images to, and then control which IAM users on your AWS account, or which Amazon systems such as ECS, have permission to download those images for use.

Once you have docker container images built and uploaded to a registry such as EC2 Container Registry, the next step is getting those docker images running on instances in the cloud.

This is where ECS comes in. At a high level ECS is a service for automating a fleet of instances to deliver and run your containers across them. ECS keeps track of your instances, how many resources they have, and what they are running. You can make requests to ECS such as “launch three yellow pentagon containers” and “launch two green hexagon containers” and ECS will find instances in your cluster that have the available resources to run those containers and instruct them to download the container from the registry and run it.

To get started with ECS you first need to create a cluster. An ECS cluster is just a group of instances that each run Docker, and a lightweight agent provided by ECS. The agent reports back to the centralized ECS control plane to tell it the status of any running containers on the machine, as well as to receive instructions to launch more containers.
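
To make the placement idea concrete, here is a toy sketch in Python. It is not how ECS actually schedules containers; it only illustrates the bookkeeping described above, where something tracks the spare CPU and memory on each instance and picks one that fits. The instance names, resource numbers, and the first-fit rule are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """One machine in the cluster, with the resources it has left."""
    name: str
    cpu_free: int      # CPU units not yet reserved
    memory_free: int   # MiB of memory not yet reserved
    running: list = field(default_factory=list)


def place(instances, container, cpu, memory):
    """Put `container` on the first instance with enough spare CPU and memory.

    Returns the chosen instance name, or None if nothing in the cluster fits.
    """
    for inst in instances:
        if inst.cpu_free >= cpu and inst.memory_free >= memory:
            inst.cpu_free -= cpu
            inst.memory_free -= memory
            inst.running.append(container)
            return inst.name
    return None


# Hypothetical two-instance cluster.
cluster = [
    Instance("instance-a", cpu_free=1024, memory_free=2048),
    Instance("instance-b", cpu_free=2048, memory_free=4096),
]

# "Launch three yellow pentagon containers" and "launch two green hexagon containers".
requests = [("yellow-pentagon", 512, 1024)] * 3 + [("green-hexagon", 256, 512)] * 2
placements = [place(cluster, name, cpu, mem) for name, cpu, mem in requests]
print(placements)
```

In this sketch the first two pentagons fill instance-a, and everything else lands on instance-b because that is the only machine with resources left.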

The easiest way to create your first cluster is to use the wizard in the web console. As you can see you just need to provide a few basic details such as the name of the cluster, the size of the instance you want to run, the VPC to run your instances in, and optionally a private key to use for SSH access to your machines.

You can also use CloudFormation, or the ECS command line interface tool, to create your cluster.

Once you have a cluster up and running the next thing we need is a task to run on the cluster. Once the application container has been uploaded into a registry we need a way to tell ECS the details about that container image, and this is what tasks are for.

This JSON document is a task definition. It defines how to launch a docker container on an instance. You can see that it references an image to download and start, environment variables, links to other containers if applicable, volumes, memory and CPU constraints. These are all properties that can be passed to the docker daemon via the run command.
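
For readers without the video in front of them, a task definition along these lines might look roughly like the sketch below, built as a Python dictionary and printed as JSON. The field names are ones the ECS task definition format uses, but the family name, container names, account ID, image tags, and paths are hypothetical placeholders, not the values shown in the video.

```python
import json

# A hypothetical task definition with two linked containers and one volume.
task_definition = {
    "family": "web-app",
    "containerDefinitions": [
        {
            "name": "web",
            # Image to download and start (placeholder registry URL).
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/web-app:v1",
            "cpu": 256,        # CPU units reserved for the container
            "memory": 512,     # hard memory limit in MiB
            "essential": True,
            "environment": [{"name": "NODE_ENV", "value": "production"}],
            "links": ["cache"],  # link to the other container in this task
            "mountPoints": [
                {"sourceVolume": "app-data", "containerPath": "/data"}
            ],
        },
        {
            "name": "cache",
            "image": "redis:3.2",
            "cpu": 128,
            "memory": 256,
            "essential": True,
        },
    ],
    "volumes": [{"name": "app-data", "host": {"sourcePath": "/ecs/app-data"}}],
}

task_definition_json = json.dumps(task_definition, indent=2)
print(task_definition_json)
```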

So in summary, this task definition is metadata about how to launch a single instance of your application container as a task on a machine. Tasks are versioned just like your docker containers can be versioned with multiple tags. This allows you to keep a library of consistent, dependable application states. The docker image is a point in time capture of your code and dependencies, and the task definition is a point in time capture of the configuration for running the image.

The next step is telling ECS that we want to run the task.
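
As a rough illustration of that step, here is a minimal sketch of asking ECS to run a task from Python. It assumes a boto3 ECS client is passed in (created with boto3.client("ecs")); the helper name run_task and the cluster and task definition names in the usage note are hypothetical.

```python
def run_task(ecs_client, cluster, task_definition, count=1):
    """Ask ECS to start `count` copies of a task somewhere in the cluster.

    `ecs_client` is assumed to be a boto3 ECS client.
    `task_definition` is either "family" or "family:revision".
    """
    response = ecs_client.run_task(
        cluster=cluster,
        taskDefinition=task_definition,
        count=count,
    )
    # ECS reports the tasks it started and, separately, any it could not place.
    started = [task["taskArn"] for task in response.get("tasks", [])]
    failures = [failure.get("reason") for failure in response.get("failures", [])]
    return started, failures
```

With credentials configured, a call might look like run_task(boto3.client("ecs"), "demo-cluster", "web-app:1"). ECS chooses the instance; the caller only names the cluster and the task definition.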

If I view the list of tasks on the cluster you can see that I have a button to launch a new task. I can also use the ECS API to run a task. ECS will respond to my request by finding a machine in the cluster that has spare CPU and memory to run the task, then tell the agent on that instance to communicate with the docker daemon on the instance and start the task. As an engineer, I no longer have to worry about the specifics of running my code on a machine; I just have a farm of machines with compute capacity, and by making a single API call I can launch my code somewhere in that cluster of instances.

This becomes even more powerful when we introduce the next level of capability in ECS: the service. Launching a single task in a cluster is fantastic if you have a batch job, or a one-off chunk of computing work that runs to completion and then terminates. But many of us are also responsible for long running code which needs to be available at all times. Maybe this code is a web server that generates HTML for a website, or which powers a backend API, or maybe it is a consumer that needs to always be available to respond to messages arriving in a queue. ECS services provide the capability to power these types of use cases. Let’s look at a service in the ECS dashboard:

You can see that this service has an associated task definition, a desired count, which is how many copies of that task to run, and a task placement strategy.

The task placement strategy describes how to distribute multiple tasks across your cluster. For this service the strategy is “spread across availability zones” (for high availability) and then “spread across the instances in each availability zone” (to even out the load across the cluster).
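
Expressed as an API request, a service like this might look roughly like the following sketch. The cluster, service, and task definition names and the desired count are hypothetical; the parameter names and the two spread fields follow the ECS CreateService API.

```python
# Hypothetical request body for creating a service.
service_request = {
    "cluster": "demo-cluster",
    "serviceName": "web",
    "taskDefinition": "web-app:1",   # associated task definition
    "desiredCount": 4,               # how many copies of the task to run
    "placementStrategy": [
        # First spread across availability zones (for high availability) ...
        {"type": "spread", "field": "attribute:ecs.availability-zone"},
        # ... then spread across the instances within each zone.
        {"type": "spread", "field": "instanceId"},
    ],
}

# With a boto3 ECS client this could be sent as:
#   ecs_client.create_service(**service_request)
```

The order of the list matters: strategies are applied in sequence, so the availability zone spread takes priority over the per-instance spread.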

ECS follows the strategy to launch as many tasks as you define across your cluster, and it keeps track of those tasks. If one of them stops for some reason (perhaps a runtime exception that causes a crash) then ECS launches a replacement.

Another feature of the ECS service is that it allows you to update your service with a new task definition version whenever you want. I can make a simple update in the console, or a single API call, and ECS will automatically begin launching my new tasks and stopping old tasks. I can configure how quickly or how safely I want the tasks to be replaced. By default it will leave old tasks running until the new task has proven to be healthy, so if I try to roll out a new task definition which is unhealthy my service won’t go offline because the old tasks will still be up and running.

So those are the core concepts of EC2 Container Service. To summarize, ECS provides the control plane and connective pieces that take your docker images and a cluster of instances and turn them into an automated deployment platform for your application.
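
For reference, the single API call that rolls a service onto a new task definition version, mentioned above, could be sketched like this. It assumes a boto3 ECS client is passed in; the helper name deploy_new_version is hypothetical, and the two percentages are written out explicitly as one example of the "how quickly or how safely" setting rather than as a recommendation.

```python
def deploy_new_version(ecs_client, cluster, service, task_definition):
    """Point a service at a new task definition and let ECS replace its tasks.

    `ecs_client` is assumed to be a boto3 ECS client.
    """
    response = ecs_client.update_service(
        cluster=cluster,
        service=service,
        taskDefinition=task_definition,
        deploymentConfiguration={
            # Keep the full desired count of healthy tasks during the rollout ...
            "minimumHealthyPercent": 100,
            # ... and allow up to double the desired count while new tasks start.
            "maximumPercent": 200,
        },
    )
    # Each deployment entry reports its status and the task definition it uses.
    return [
        (deployment["status"], deployment["taskDefinition"])
        for deployment in response["service"]["deployments"]
    ]
```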