Understanding Identity and Access Management Roles for ECS/EKS
IAM Roles for ECS/EKS: Right Permissions, Right Place
Continuing our journey through the headache that is cloud security, today we are tackling the absolute linchpin of container hygiene: Identity and Access Management (IAM) roles for ECS and EKS.
As a security architect, I spend half my life banging on about the principle of least privilege. But nowhere is this principle ignored more frequently—and more dangerously—than in container orchestration. I cannot tell you how many times I’ve audited an environment where the infrastructure is locked down tight, but the actual container task is running with AdministratorAccess because a developer couldn’t get S3 writes working at 4 PM on a Friday. It is a gaping security hole, and frankly, we need to stop doing it.
The Role Confusion: Workers vs. Managers
IAM roles are the gatekeepers of your AWS estate, but in the world of ECS and EKS, getting them right is the difference between a minor application bug and a full-scale cloud compromise. The trick lies in understanding that not all roles do the same job, and mixing them up is where the trouble starts.
First, you have the Task Role. This is the one people get wrong most often. The Task Role is attached to the specific task (in ECS) or the Pod (in EKS), and it defines what the application code inside the container is allowed to do. If your Python script needs to read a file from S3 and write a record to DynamoDB, those permissions belong here. This is your primary tool for least privilege. If a specific container only calculates tax, it shouldn’t have permissions to delete your customer database. It sounds obvious, but generic, over-privileged “AppRoles” are alarmingly common.
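To make that concrete, here is a minimal sketch of what such a Task Role policy could look like. The bucket, table, region, and account ID are hypothetical placeholders, not anything from a real environment:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadInputObjects",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-tax-input/*"
    },
    {
      "Sid": "WriteResultRecords",
      "Effect": "Allow",
      "Action": "dynamodb:PutItem",
      "Resource": "arn:aws:dynamodb:eu-west-1:123456789012:table/example-tax-results"
    }
  ]
}
```

No wildcards, no delete actions: the role can do exactly two things, so a compromised container can do no more than those two things.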
In the ECS world, there is a subtle but critical distinction between the Task Role and the Task Execution Role that often trips people up. While the Task Role empowers your app, the Task Execution Role is for the ECS agent—the infrastructure itself. It allows ECS to pull your Docker image from ECR and shove logs into CloudWatch. The mistake I see constantly is engineers dumping application permissions into the Execution Role, blurring the lines between the plumbing and the water flowing through it.
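The cleanest way to see the split is in the task definition itself, where the two roles sit side by side. A stripped-down sketch, with hypothetical ARNs:

```json
{
  "family": "tax-calculator",
  "taskRoleArn": "arn:aws:iam::123456789012:role/tax-calculator-task-role",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecs-task-execution-role",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/tax-calculator:latest"
    }
  ]
}
```

The executionRoleArn is what lets ECS pull that ECR image and ship logs; the taskRoleArn is the identity your code sees when it calls the AWS SDK.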
Then you have the Service Roles. These are the permissions required by the underlying machinery—the scheduler or the cluster—to keep the lights on. They allow ECS or EKS to manage Elastic Load Balancers, register targets, and spin up or terminate EC2 instances. Generally, AWS handles these Service-Linked Roles for you automatically now, which is a blessing. The goal is to ensure the orchestration layer has the power to manage infrastructure without exposing those permissions to the application code. Your web app container shouldn’t know how to terminate an EC2 instance, and with proper role separation, it never will.
The Art of Least Privilege
Keeping this tidy requires a ruthless approach to permissions. You have to start with zero and add them back one by one. If a developer asks for s3:*, you need to ask why they need to delete buckets when they only need to read objects. It’s painful at first, but it saves your skin later.
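In practice, when a developer insists they need s3:*, what they usually need is something like this; a sketch, assuming the app only reads reports under one prefix of a hypothetical bucket:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::example-reports-bucket",
      "Condition": {
        "StringLike": { "s3:prefix": "reports/*" }
      }
    },
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-reports-bucket/reports/*"
    }
  ]
}
```

Note the Condition on the ListBucket statement: even the directory listing is scoped to the one prefix the app actually touches.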
For those running EKS, the strategy shifts slightly. You shouldn’t attach application IAM policies to the worker nodes’ EC2 instance role, as that hands every pod on the node the same permissions. Instead, you use IRSA (IAM Roles for Service Accounts) or the newer EKS Pod Identity. This maps an AWS IAM role to a specific Kubernetes Service Account, allowing for granular, pod-level isolation. And a word of warning on AWS Managed Policies: they are convenient, but often overly broad. A “Read Only” managed policy might still grant access to data you consider sensitive, so always review the JSON before you click attach.
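Under IRSA, the binding lives in the IAM role’s trust policy: the role can only be assumed through the cluster’s OIDC provider, and only by one specific service account. A sketch in which the account ID, OIDC provider ID, namespace, and service account name are all placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/EXAMPLE1234567890"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.eu-west-1.amazonaws.com/id/EXAMPLE1234567890:sub": "system:serviceaccount:billing:tax-calculator"
        }
      }
    }
  ]
}
```

On the Kubernetes side, you annotate the service account with the role’s ARN (the eks.amazonaws.com/role-arn annotation), and only pods running under that service account receive the role’s credentials.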
Hardening the Container Runtime
We can configure IAM roles perfectly, but if the container runtime is insecure, we are still in trouble. It starts with the golden rule: never, ever run as root. By default, Docker containers run as root, meaning if an attacker breaks out of that container, they effectively have root access on the host node.
Fixing this is trivially easy. In your Dockerfile, you simply create a user and switch to it:
```Dockerfile
FROM alpine:latest

# Create an unprivileged user (BusyBox adduser; -D skips password creation)
RUN adduser -D nonrootuser

# ... install dependencies and copy application files here ...

# Every subsequent instruction, and the running container, uses this user
USER nonrootuser
```
For an extra layer of paranoia—which is just good security—you should look at User Namespaces. This maps the root user inside the container to a non-privileged user on the host, acting as a failsafe if your configuration slips up.
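On a Docker host you manage yourself, turning this on is a one-line change to /etc/docker/daemon.json (it requires a daemon restart, and running containers are not remapped retroactively):

```json
{
  "userns-remap": "default"
}
```

With “default”, Docker creates a dockremap user and maps container UIDs into that unprivileged host range, so UID 0 inside the container is just an ordinary, powerless UID outside it.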
Beyond identity, you need to strip the container down to the bare essentials. Use minimal base images like Alpine or “Distroless” variants. If curl or bash isn’t in the image, an attacker has a much harder time running a script if they do get in. You should also drop Linux “Capabilities” that you don’t need—your web app likely doesn’t need NET_ADMIN rights—and consider mounting the root filesystem as read-only. If your app can’t write to the disk, malware struggles to download and save executables.
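In an ECS container definition, both of those last two steps are a few lines of JSON. A sketch of the relevant fields (capability dropping via linuxParameters is fully supported on the EC2 launch type; Fargate imposes extra restrictions worth checking first):

```json
{
  "name": "app",
  "image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/tax-calculator:latest",
  "readonlyRootFilesystem": true,
  "linuxParameters": {
    "capabilities": {
      "drop": ["ALL"]
    }
  }
}
```

If the app genuinely needs scratch space, mount a dedicated writable volume rather than handing back the entire filesystem.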
That wraps up our look at IAM roles and container hardening. It’s a lot of “don’t do this, don’t do that,” but getting these foundations right is what lets you sleep at night while your clusters scale up and down.
Next time, we’ll shift focus to another corner of the cloud. Until then, keep your head in the clouds and your containers locked down.
