Amazon Elastic Container Service (ECS) 🚀 is a scalable and flexible container management service that lets you run, stop, and manage Docker containers in a cluster. While AWS ECS provides a web-based console for managing containers, you sometimes want to log in to a container and execute commands. Especially when your task deployments are failing, debugging through the CloudWatch logs alone can be tedious.

In this article, we'll show you how to execute commands in your Fargate containers on AWS ECS. 💻 This setup uses the SSM agent, which is preinstalled on all Fargate instances. If you choose another launch type, the setup will look different.

Table of contents

Setup of IAM roles 👥
Configure the ECS task ⚙️
Update the ECS service 🔄
Access the container 💻
Troubleshooting 🔍

Setup of IAM roles 👥

First, we set up the roles and policies that are attached to the ECS task and to the EC2 instance or IAM user that runs the commands. In total, we need three roles:

  • Task role
  • Task execution role
  • EC2/IAM user role

Task role

An AWS ECS task role grants permissions to the containers running within an ECS task. To execute commands, the containers must be able to open SSM message channels, so we create a new role with the following policy attached:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ssmmessages:CreateControlChannel",
                "ssmmessages:CreateDataChannel",
                "ssmmessages:OpenControlChannel",
                "ssmmessages:OpenDataChannel"
            ],
            "Resource": "*"
        }
    ]
}

Next, we add a trust relationship with the ECS tasks service so that our tasks can assume the role:

{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Service": "ecs-tasks.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
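
The two JSON documents above can be turned into a role with the AWS CLI. The following is a minimal sketch, not the only way to do it. It assumes the permissions policy is saved as task-role-policy.json and the trust relationship as ecs-trust-policy.json; both file names, the inline policy name ecs-exec-ssmmessages and the function name are placeholders chosen for this example. The default role name ECSTaskRole matches the task definition used later in this article.

```shell
# Sketch: create the task role from the two JSON documents above.
# Assumed (hypothetical) file names: ecs-trust-policy.json and task-role-policy.json.
# Step 1 creates the role with the ecs-tasks.amazonaws.com trust relationship.
# Step 2 adds the ssmmessages permissions as an inline policy.
create_task_role() {
    role_name="${1:-ECSTaskRole}"

    aws iam create-role \
        --role-name "$role_name" \
        --assume-role-policy-document file://ecs-trust-policy.json &&
    aws iam put-role-policy \
        --role-name "$role_name" \
        --policy-name ecs-exec-ssmmessages \
        --policy-document file://task-role-policy.json
}
```

Calling `create_task_role` without an argument uses the role name ECSTaskRole; pass a different name as the first argument if you prefer.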

Task execution role

A task execution role in AWS ECS grants the ECS container agent permission to make AWS API calls on your behalf, such as pulling Docker images from Amazon ECR.

The role doesn't need any special setup for command execution. I recommend attaching at least the following AWS-managed policies: AWSOpsWorksCloudWatchLogs, AmazonEC2ContainerRegistryReadOnly and AmazonEC2ContainerServiceRole. The exact setup depends on your needs. If you need to call other services like AWS RDS, you need to add the appropriate policies. That is beyond the scope of this article, so I will skip it here.

EC2/IAM user role

The last role we create will be attached to the EC2 instance or IAM user that executes the commands. In my case, I will use an EC2 instance for that. The following policy shows the minimal setup with the actions ecs:DescribeTasks and ecs:ExecuteCommand:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "ecs:DescribeTasks",
            "Resource": "arn:aws:ecs:<region>:<accountID>:task/<clusterName>/*"
        },
        {
            "Effect": "Allow",
            "Action": "ecs:ExecuteCommand",
            "Resource": [
                "arn:aws:ecs:<region>:<accountID>:task/<clusterName>/*",
                "arn:aws:ecs:<region>:<accountID>:cluster/<clusterName>"
            ],
            "Condition": {
                "StringEquals": {
                    "ecs:container-name": "<containerName>"
                }
            }
        }
    ]
}

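Because task IDs change whenever a task is replaced, the task resource ends in a wildcard. You should therefore use conditions in your IAM policy, such as the one on ecs:container-name above, to restrict access.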

Since I am using an EC2 instance, the role also needs a trust policy for the EC2 service:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ec2.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
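
For the EC2 case, the role additionally has to be wrapped in an instance profile before it can be attached to an instance. The sketch below shows one way this could look with the AWS CLI. It assumes the permissions policy is saved as ecs-exec-client-policy.json and the trust policy as ec2-trust-policy.json; these file names, the inline policy name ecs-exec-client and the function name are hypothetical, and the instance profile simply reuses the role name.

```shell
# Sketch: create the role for the EC2 instance and wrap it in an instance profile.
# Assumed (hypothetical) file names: ec2-trust-policy.json and ecs-exec-client-policy.json.
# The instance profile gets the same name as the role to keep the example short.
create_exec_client_role() {
    role_name="$1"

    aws iam create-role \
        --role-name "$role_name" \
        --assume-role-policy-document file://ec2-trust-policy.json &&
    aws iam put-role-policy \
        --role-name "$role_name" \
        --policy-name ecs-exec-client \
        --policy-document file://ecs-exec-client-policy.json &&
    aws iam create-instance-profile \
        --instance-profile-name "$role_name" &&
    aws iam add-role-to-instance-profile \
        --instance-profile-name "$role_name" \
        --role-name "$role_name"
}
```

A call such as `create_exec_client_role SSMSessionmanagerRole` would create the role and profile; the profile still has to be associated with the instance afterwards, for example in the EC2 console.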

Configure the ECS task ⚙️

Now that we have the required roles, we need to adjust our task definition. In the containerDefinitions, we add the following setting:

"linuxParameters": {
	"initProcessEnabled": true
}

The whole task definition would look like this:

{
    "family": "<name>",
    "containerDefinitions": [
        {
            "name": "alpine",
            "image": "<accountID>.dkr.ecr.<region>.amazonaws.com/<imageName>:latest",
            "cpu": 0,
            "portMappings": [
                {
                    "name": "80-tcp",
                    "containerPort": 80,
                    "hostPort": 80,
                    "protocol": "tcp",
                    "appProtocol": "http"
                }
            ],
            "essential": true,
            "environment": [],
            "mountPoints": [],
            "volumesFrom": [],
            "linuxParameters": {
                "initProcessEnabled": true
            },
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-create-group": "true",
                    "awslogs-group": "/ecs/<clusterName>",
                    "awslogs-region": "<region>",
                    "awslogs-stream-prefix": "ecs"
                }
            }
        }
    ],
    "taskRoleArn": "arn:aws:iam::<accountID>:role/ECSTaskRole",
    "executionRoleArn": "arn:aws:iam::<accountID>:role/ECSTaskExecutionRole",
    "networkMode": "awsvpc",
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "1024",
    "memory": "3072",
    "runtimePlatform": {
        "cpuArchitecture": "X86_64",
        "operatingSystemFamily": "LINUX"
    }
}
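
If you manage the task definition as a file, it can be registered from the command line. This is a small sketch that assumes the JSON above is saved as task-definition.json with the placeholders filled in; the file name and the function name are assumptions made for this example.

```shell
# Sketch: register the task definition shown above as a new revision.
# Assumed (hypothetical) default file name: task-definition.json.
# The --query expression limits the output to the ARN of the new revision.
register_task_definition() {
    file="${1:-task-definition.json}"

    aws ecs register-task-definition \
        --cli-input-json "file://$file" \
        --query 'taskDefinition.taskDefinitionArn' \
        --output text
}
```

Each call creates a new revision of the family named in the file, so the service has to be pointed at that revision afterwards.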

Update the ECS service 🔄

Finally, we need to update the service with the flag --enable-execute-command. Only tasks started after this change have command execution enabled. I usually also set the desired count of running tasks:

aws ecs update-service --cluster fargate-debug --enable-execute-command --task-definition fargate-debug --service alpine --desired-count 1
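
To confirm that the flag was applied, the service can be queried afterwards. The following sketch reads the enableExecuteCommand field from the describe-services output; the function name is a placeholder chosen for this example.

```shell
# Sketch: check whether command execution is enabled on a service.
# Arguments: cluster name, service name.
exec_enabled_on_service() {
    cluster="$1"
    service="$2"

    aws ecs describe-services \
        --cluster "$cluster" \
        --services "$service" \
        --query 'services[0].enableExecuteCommand' \
        --output text
}
```

For the command above, this would be called as `exec_enabled_on_service fargate-debug alpine` and should report a true value once the update has gone through.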

Access the container 💻

We are all set up to access the container now. We can do that via the AWS CLI, which utilizes the SSM agent preinstalled on Fargate instances:

aws ecs execute-command --cluster <clusterName> --task "arn:aws:ecs:<region>:<accountID>:task/<clusterName>/<taskID>" --container alpine --interactive --command "/bin/sh"

With the --command flag, we can run any command. Using "/bin/sh", we start an interactive shell. 🎉
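
Because the task ID changes with every new task, it can be convenient to look it up instead of copying it by hand. The sketch below, under a few assumptions, picks the first running task of a service and opens a shell in it. The function name is hypothetical, and the check for "None" assumes the CLI prints that value in text mode when no task is found. It also checks for the Session Manager plugin, which the AWS CLI needs locally for execute-command.

```shell
# Sketch: open a shell in the first running task of a service.
# Arguments: cluster name, service name, container name.
exec_into_service() {
    cluster="$1"
    service="$2"
    container="$3"

    # The AWS CLI relies on the Session Manager plugin for execute-command.
    if ! command -v session-manager-plugin >/dev/null 2>&1; then
        echo "session-manager-plugin not found in PATH" >&2
        return 1
    fi

    # Look up the ARN of the first running task of the service.
    task_arn="$(aws ecs list-tasks \
        --cluster "$cluster" \
        --service-name "$service" \
        --desired-status RUNNING \
        --query 'taskArns[0]' \
        --output text)" || return 1

    if [ -z "$task_arn" ] || [ "$task_arn" = "None" ]; then
        echo "no running task found for service $service" >&2
        return 1
    fi

    aws ecs execute-command \
        --cluster "$cluster" \
        --task "$task_arn" \
        --container "$container" \
        --interactive \
        --command "/bin/sh"
}
```

With the names used in this article, the call would be `exec_into_service fargate-debug alpine alpine`.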

Troubleshooting 🔍

This setup is very error-prone. The most common issues I have encountered are:

An error occurred (InvalidParameterException) when calling the ExecuteCommand operation: The execute command failed because execute command was not enabled when the task was run or the execute command agent isn't running. Wait and try again or run a new task with execute command enabled and try again.

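Here, check that you have updated the ECS service. Run the update-service command with --enable-execute-command and wait until a new task has been spun up; tasks that were already running before the update do not have command execution enabled.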
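
To narrow this error down, it can help to look at the task itself. The sketch below shows two hypothetical helper functions: one prints whether command execution is enabled on a task together with the state of its managed agents, and the other forces the service to start fresh tasks with the flag set.

```shell
# Sketch: inspect a task and, if needed, roll the service.
# exec_agent_status arguments: cluster name, task ID or ARN.
exec_agent_status() {
    cluster="$1"
    task="$2"

    aws ecs describe-tasks \
        --cluster "$cluster" \
        --tasks "$task" \
        --query 'tasks[0].{execEnabled: enableExecuteCommand, agents: containers[].managedAgents[]}' \
        --output json
}

# redeploy_with_exec arguments: cluster name, service name.
# Starts new tasks so that the flag takes effect on them.
redeploy_with_exec() {
    cluster="$1"
    service="$2"

    aws ecs update-service \
        --cluster "$cluster" \
        --service "$service" \
        --enable-execute-command \
        --force-new-deployment
}
```

If the first function reports that execution is disabled, or the agent named ExecuteCommandAgent is not in a running state, redeploying and retrying after the new task is up is a reasonable next step.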

An error occurred (AccessDeniedException) when calling the ExecuteCommand operation: User: arn:aws:sts::<accountID>:assumed-role/SSMSessionmanagerRole/i-<instanceID> is not authorized to perform: ecs:ExecuteCommand on resource: arn:aws:ecs:<region>:<accountID>:cluster/<clusterName> with an explicit deny in an identity-based policy

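Check the IAM role you have created for the EC2 instance or user that is executing the command. The message names the denied action and resource, so compare both with the Action, Resource and Condition entries in the attached policy.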
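
A quick way to start is to confirm which identity the CLI is actually using and which policies are attached to its role. The following sketch assumes the caller is allowed to read its own IAM configuration, which is often not the case on a locked-down instance; the function name is a placeholder.

```shell
# Sketch: show the calling identity and the policies of a role.
# Argument: name of the role attached to the instance or user.
# Note: the iam calls need read permissions on IAM, which the role may not have.
show_caller_and_policies() {
    role_name="$1"

    aws sts get-caller-identity --query Arn --output text &&
    aws iam list-role-policies --role-name "$role_name" &&
    aws iam list-attached-role-policies --role-name "$role_name"
}
```

For the error above, the role name would be SSMSessionmanagerRole. If the listed policies contain a Deny statement that covers ecs:ExecuteCommand, that statement takes precedence over any Allow.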
