Introduction
While building Lambda-based automation for patch management and security hardening, I frequently ran into a frustrating problem: some EC2 instances failed during software installation or SSM command execution. Usually the root cause was that the instance wasn’t a Managed Instance in SSM.
An EC2 instance must be managed by SSM in order to use features like Run Command, Patch Manager, or inventory collection. If it’s not managed, automation fails silently or unpredictably.
To debug this reliably, I started using the AWSSupport-TroubleshootManagedInstance runbook, which checks network paths, IAM roles, and VPC settings. In this post, I’ll show you how to launch the runbook and walk through what it checks behind the scenes, helping you fix SSM issues fast.
Step 1: Launch the SSM Troubleshooting Runbook
You can start the runbook directly from the AWS Systems Manager console:
- Navigate to Systems Manager > Documents
- Search for AWSSupport-TroubleshootManagedInstance
- Choose Simple execution, Rate control, Multi-account and Region, or Manual execution - depending on your needs
Once launched, the runbook automatically performs several diagnostic steps to determine why the EC2 instance is not recognized as a Managed Instance by Systems Manager.
🛡️ Note: The role executing the runbook must have at least the following permissions to perform these checks successfully:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "ssm:DescribeInstanceInformation",
        "ec2:DescribeNetworkInterfaces",
        "iam:GetInstanceProfile",
        "ec2:DescribeVpcs",
        "ec2:DescribeInstances",
        "iam:ListAttachedRolePolicies",
        "ssm:GetServiceSetting",
        "ec2:DescribeVpcEndpoints",
        "ec2:DescribeNetworkAcls",
        "ec2:DescribeRouteTables",
        "ec2:DescribeSecurityGroups"
      ],
      "Resource": "*"
    }
  ]
}
This allows the runbook to query the relevant EC2, IAM, and VPC configuration needed to validate connectivity and permissions.

You can also start the same runbook from the AWS CLI:
aws ssm start-automation-execution \
  --document-name "AWSSupport-TroubleshootManagedInstance" \
  --parameters "InstanceId=i-07ad37999d96ee3e8,AutomationAssumeRole=arn:aws:iam::<accountID>:role/SSMTroubleshootManagedInstanceRole"
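Since my automation runs in Lambda, I usually want the same call from Python. Here is a minimal sketch, assuming boto3 is available and credentials are configured. The function names `build_start_args` and `start_troubleshooting` are my own; only `start_automation_execution` and its `DocumentName` and `Parameters` arguments come from boto3.

```python
def build_start_args(instance_id, assume_role_arn):
    """Build keyword arguments for ssm.start_automation_execution."""
    return {
        "DocumentName": "AWSSupport-TroubleshootManagedInstance",
        # Automation parameters are a map of name -> list of strings.
        "Parameters": {
            "InstanceId": [instance_id],
            "AutomationAssumeRole": [assume_role_arn],
        },
    }


def start_troubleshooting(instance_id, assume_role_arn, region=None):
    """Start the runbook and return the automation execution ID."""
    import boto3  # imported lazily so build_start_args works without boto3

    ssm = boto3.client("ssm", region_name=region)
    response = ssm.start_automation_execution(
        **build_start_args(instance_id, assume_role_arn)
    )
    return response["AutomationExecutionId"]
```

The returned execution ID is what you would pass to `get_automation_execution` to read the result once the run finishes.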
Step 2: What the Runbook Actually Does
The power of this runbook lies in the sequence of checks it performs. Below is a breakdown of each step:
✅ GetPingStatus: Check Connection to SSM
This step uses the DescribeInstanceInformation API to determine if the instance is already connected to SSM. If it is, the runbook exits early. If not, the diagnosis continues.
🔀 BranchOnIsInstanceAlreadyOnline: Conditional Branching
The runbook branches here: if the instance is already online in SSM, it skips all further steps. Otherwise, it proceeds to identify the root cause.
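To make the ping check and the branch concrete, here is a rough sketch of the same decision in Python. It works on the dictionary that boto3's `describe_instance_information` returns. The function names and the two return labels are hypothetical, and this is my approximation rather than the runbook's actual code.

```python
def is_instance_online(response, instance_id):
    """Return True if SSM reports the instance with PingStatus 'Online'."""
    for info in response.get("InstanceInformationList", []):
        if info.get("InstanceId") == instance_id:
            return info.get("PingStatus") == "Online"
    # An instance missing from the list is not registered with SSM at all.
    return False


def next_action(response, instance_id):
    """Mirror the branch: stop early when online, otherwise keep diagnosing."""
    if is_instance_online(response, instance_id):
        return "skip_remaining_checks"
    return "continue_diagnosis"
```

An empty `InstanceInformationList` is the typical sign of an unmanaged instance, so the sketch treats it the same as an offline one.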
🔍 GetEC2InstanceProperties: Gather Instance Metadata
Collects metadata such as:
- Subnet ID
- VPC ID
- Private IP
- Security Groups
- IAM instance profile
This data is essential for validating network and IAM setup.
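If you want to collect the same fields yourself, the sketch below pulls them out of a `describe_instances` response from boto3. The function name and the output keys are my own choices, and it assumes the response contains exactly the one instance you asked for.

```python
def extract_instance_properties(response):
    """Pick the fields the later checks need from a describe_instances response."""
    instance = response["Reservations"][0]["Instances"][0]
    # IamInstanceProfile is absent when no profile is attached.
    profile = instance.get("IamInstanceProfile") or {}
    return {
        "subnet_id": instance.get("SubnetId"),
        "vpc_id": instance.get("VpcId"),
        "private_ip": instance.get("PrivateIpAddress"),
        "security_group_ids": [
            group["GroupId"] for group in instance.get("SecurityGroups", [])
        ],
        "instance_profile_arn": profile.get("Arn"),
    }
```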
🌐 CheckVpcEndpoint: Validate VPC Endpoint for SSM
Checks whether a Systems Manager VPC Endpoint (interface type) exists in the VPC. If it does:
- Validates that security groups attached to the endpoint allow inbound TCP 443 from the instance’s private IP or SG.
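The inbound rule check can be approximated like this. The sketch takes the endpoint's security groups in the shape returned by boto3's `describe_security_groups` and only handles IPv4 ranges and security group references. The function names are hypothetical and this is not the runbook's own logic.

```python
import ipaddress


def rule_covers_port(permission, port):
    """True if a security group rule applies to the given TCP port."""
    protocol = permission.get("IpProtocol")
    if protocol == "-1":  # "-1" means all protocols and all ports
        return True
    if protocol != "tcp":
        return False
    return permission.get("FromPort", 0) <= port <= permission.get("ToPort", 65535)


def endpoint_allows_instance(endpoint_groups, instance_ip, instance_sg_ids, port=443):
    """True if any endpoint security group admits the instance on the port."""
    address = ipaddress.ip_address(instance_ip)
    for group in endpoint_groups:
        for permission in group.get("IpPermissions", []):
            if not rule_covers_port(permission, port):
                continue
            # Match on the instance's private IP ...
            for ip_range in permission.get("IpRanges", []):
                if address in ipaddress.ip_network(ip_range["CidrIp"], strict=False):
                    return True
            # ... or on one of the instance's security groups.
            for pair in permission.get("UserIdGroupPairs", []):
                if pair.get("GroupId") in instance_sg_ids:
                    return True
    return False
```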
🛣️ CheckRouteTable: Ensure Route to SSM
Looks at the subnet’s route table to confirm that there is either:
- The local route needed to reach the SSM VPC endpoint (preferred), or
- A route to the public SSM endpoint (via an Internet Gateway, if no VPC endpoint is used)
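Here is a simplified sketch of that route decision, working on one route table in the shape boto3's `describe_route_tables` returns. The function names are my own. Treating a NAT gateway as a valid internet path is my assumption and not something the runbook description above states, and an Internet Gateway route only counts when the instance has a public IP.

```python
def classify_routes(route_table):
    """Summarize the active routes that matter for reaching SSM."""
    has_local = False
    internet_target = None
    for route in route_table.get("Routes", []):
        if route.get("State", "active") != "active":
            continue  # skip blackhole routes
        if route.get("GatewayId") == "local":
            has_local = True
        elif route.get("DestinationCidrBlock") == "0.0.0.0/0":
            internet_target = route.get("GatewayId") or route.get("NatGatewayId")
    return {"has_local_route": has_local, "internet_target": internet_target}


def has_path_to_ssm(route_table, has_vpc_endpoint, has_public_ip):
    """Decide whether the subnet can reach SSM at the routing level."""
    routes = classify_routes(route_table)
    if has_vpc_endpoint:
        # Interface endpoints live inside the VPC, so the local route is enough.
        return routes["has_local_route"]
    target = routes["internet_target"] or ""
    if target.startswith("igw-"):
        return has_public_ip
    return target.startswith("nat-")  # assumption: NAT gateway as the exit path
```

An Internet Gateway route on an instance without a public IP fails this check, which is one of the easier problems to miss when reading a route table by eye.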
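🔐 CheckNacl: Inspect Network ACLs
Ensures that the NACLs on the subnet allow outbound HTTPS (TCP 443) traffic to SSM and the corresponding inbound return traffic. Because NACLs are stateless, the return traffic on ephemeral ports has to be allowed explicitly.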
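NACL rules are evaluated in ascending rule-number order and the first match wins, which is easy to get wrong by eye. The sketch below applies that ordering to the `Entries` list from boto3's `describe_network_acls`. It only handles IPv4 entries, the function names are hypothetical, and the remote IP in the usage is a placeholder.

```python
import ipaddress


def entry_matches(entry, address, port):
    """True if a NACL entry applies to this IPv4 address and TCP port."""
    if "CidrBlock" not in entry:  # IPv6-only entries are ignored in this sketch
        return False
    if address not in ipaddress.ip_network(entry["CidrBlock"], strict=False):
        return False
    protocol = entry.get("Protocol")
    if protocol == "-1":  # all protocols
        return True
    if protocol != "6":  # 6 is the protocol number for TCP
        return False
    port_range = entry.get("PortRange")
    if not port_range:
        return True
    return port_range["From"] <= port <= port_range["To"]


def nacl_allows(entries, egress, remote_ip, port):
    """Evaluate entries in rule-number order; the first match decides."""
    address = ipaddress.ip_address(remote_ip)
    relevant = [entry for entry in entries if entry.get("Egress") is egress]
    for entry in sorted(relevant, key=lambda item: item["RuleNumber"]):
        if entry_matches(entry, address, port):
            return entry["RuleAction"] == "allow"
    return False
```

For SSM you would call it twice: once with `egress=True` and port 443, and once with `egress=False` and an ephemeral port to cover the return traffic.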
🔄 CheckInstanceSecurityGroup: Outbound Rules
Validates that the instance’s security group allows outbound traffic to:
- The SSM VPC endpoint, or
- The public Systems Manager endpoint
Even with VPC endpoints, outbound SG rules are still required.
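A rough equivalent of the outbound check, again over `describe_security_groups` output. It returns the IDs of the instance security groups that allow outbound TCP 443 to a destination IP or to one of the endpoint's security groups. The function name and parameters are hypothetical and only IPv4 ranges are considered.

```python
import ipaddress


def groups_allowing_https_out(security_groups, destination_ip, endpoint_sg_ids=()):
    """Return IDs of security groups with an egress rule covering TCP 443."""
    address = ipaddress.ip_address(destination_ip)
    allowed = []
    for group in security_groups:
        for permission in group.get("IpPermissionsEgress", []):
            protocol = permission.get("IpProtocol")
            if protocol == "-1":  # all traffic
                covers_port = True
            elif protocol == "tcp":
                covers_port = (
                    permission.get("FromPort", 0) <= 443 <= permission.get("ToPort", 65535)
                )
            else:
                covers_port = False
            if not covers_port:
                continue
            cidr_match = any(
                address in ipaddress.ip_network(ip_range["CidrIp"], strict=False)
                for ip_range in permission.get("IpRanges", [])
            )
            group_match = any(
                pair.get("GroupId") in endpoint_sg_ids
                for pair in permission.get("UserIdGroupPairs", [])
            )
            if cidr_match or group_match:
                allowed.append(group["GroupId"])
                break  # one matching rule per group is enough
    return allowed
```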
🔑 CheckInstanceIAM: Verify IAM Role and Account Settings
Checks whether:
- The IAM instance profile includes AmazonSSMManagedInstanceCore or equivalent permissions.
- The account has Default Host Management Configuration enabled (optional fallback).
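Both IAM checks can be sketched against boto3 responses: `list_attached_role_policies` for the role behind the instance profile, and `get_service_setting` for Default Host Management Configuration. The function names are mine. Reading a `Status` of "Customized" as "enabled" is my assumption about how the setting behaves, and the policy check only looks for the one managed policy by ARN, not for equivalent custom permissions.

```python
CORE_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"


def has_core_policy(attached_policies_response):
    """True if AmazonSSMManagedInstanceCore is attached to the role."""
    return any(
        policy.get("PolicyArn") == CORE_POLICY_ARN
        for policy in attached_policies_response.get("AttachedPolicies", [])
    )


def dhmc_role(service_setting_response):
    """Return the configured DHMC role name, or None if left at the default.

    Assumption: a customized setting means the feature has been turned on.
    """
    setting = service_setting_response.get("ServiceSetting", {})
    if setting.get("Status") != "Customized":
        return None
    return setting.get("SettingValue")


def instance_has_ssm_permissions(attached_policies_response, service_setting_response):
    """Either path is enough for the agent to register with SSM."""
    return (
        has_core_policy(attached_policies_response)
        or dhmc_role(service_setting_response) is not None
    )
```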

Here is the output from a run against instance i-07ad37999d96ee3e8:
1. Checks for Amazon VPC Systems Manager VPC Endpoint 'com.amazonaws.us-east-1.ssm':
- [INFO] No VPC endpoint for Systems Manager found on the EC2 instance VPC: vpc-0659a354588e79688.
2. Checks for the VPC route table entries of the instance's subnet 'subnet-02b79aec564cde270:'
- [INFO] VPC route table found: rtb-03f9fb735c4d0263b.
- [INFO] VPC local route (default route) available for 172.31.0.0/16.
- [WARNING] A local route is required to communicate with the VPC endpoint interface.
- [INFO] VPC Internet route with destination 0.0.0.0/0 found with target 'igw-04e402dfdc74364bf'.
- [WARNING] VPC internet gateway 'igw-04e402dfdc74364bf' route associated, however the instance does not have a public IP address associated. Internet connectivity through the internet gateway is unavailable.
- For more information about routing options see https://docs.aws.amazon.com/vpc/latest/userguide/route-table-options.html#route-tables-vpc-peering
- For more information about route tables see https://docs.aws.amazon.com/vpc/latest/userguide/route-table-options.html#route-tables-vpc-peering
3. Checks for NACL rules of the instance subnet 'subnet-02b79aec564cde270':
- Check network ACLs requirements instance 'i-07ad37999d96ee3e8' for instance subnet 'subnet-02b79aec564cde270':
- Check network ACLs requirements on network ACL 'acl-0f49be11fb95a1ec2':
- [OK] 'ALL' outbound traffic allowed to '172.31.16.253' from '[1024, 65535]'
4. Checks EC2 instance 'i-07ad37999d96ee3e8' security groups outbound traffic:
- Check outbound traffic to the public Systems Manager endpoint:
- [INFO] Instance security group 'sg-0ab244b3363a23c1b' allows outbound traffic on port '443' to '0.0.0.0/0'.
- [OK] Instance security group 'sg-0ab244b3363a23c1b' allows outbound traffic on port '443' to public System Manager endpoint.
5. Checks EC2 instance IAM profile and required permissions:
- Check Default Host Management Configuration:
- [INFO] Default Host Management Configuration is Default.
- Check for AWS managed policies attached to the instance profile 'EC2SSMSessionManagerRole':
- [OK] Found an AWS managed policy attached to the instance profile 'EC2SSMSessionManagerRole' with required permissions.
6. Additional Troubleshooting:
- Starting with the SSM Agent version 3.1.501.0, you can use the 'ssm-cli' tool to diagnose issues at the operating system level.
- Troubleshooting managed node availability using ssm-cli:
- https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-cli.html
- Troubleshooting reference:
- https://repost.aws/knowledge-center/systems-manager-ec2-instance-not-appear
- https://docs.aws.amazon.com/systems-manager/latest/userguide/troubleshooting-ssm-agent.html
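When this report comes back inside an automation, I mostly care about the warnings. Here is a small sketch that groups the tagged lines by level, assuming the report text keeps the `[INFO]`, `[OK]`, and `[WARNING]` tags shown in the output above. The function name is hypothetical.

```python
import re

# Matches a bracketed upper-case level tag followed by the message text.
FINDING = re.compile(r"\[([A-Z]+)\]\s*(.*)")


def collect_findings(report_text):
    """Group report lines by their level tag, e.g. INFO, OK, WARNING."""
    findings = {}
    for line in report_text.splitlines():
        match = FINDING.search(line)
        if match:
            level, message = match.group(1), match.group(2).strip()
            findings.setdefault(level, []).append(message)
    return findings
```

Lines without a tag, such as the numbered section headers, are skipped, so `collect_findings(report).get("WARNING", [])` gives a short list of things to fix first.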