SSH to EC2

Wed 30 August 2017 by Moshe Zadka

(Thanks to Donald Stufft for reviewing this post, and to Glyph Lefkowitz for inspiring much of it.)

(JP Calderone wrote a Twisted version of this approach.)

After creating an EC2 instance in AWS, the next step is often to SSH into it. This might be because the machine is a development machine, or because SSH is tilling the ground for a different remote control: for example, setting up a Salt minion.

In those cases, many people either type yes when SSH prompts them about an unknown host key, or even turn off host key verification altogether. This is convenient, quick, and very insecure. A man in the middle can use this to steal credentials -- maybe not permanently, but enough to log in to any other machine with the same SSH key.

The correct thing to do is to prepare the SSH configuration by retrieving the host key via the AWS API. Unfortunately, doing it is not trivial.

Fortunately, it is a good example of how to use the AWS API from Python.

import os
import sys
import boto3

client = boto3.client('ec2', region_name='us-west-2')
resource = boto3.resource('ec2', region_name='us-west-2')

output = client.get_console_output(InstanceId=sys.argv[1])
result = output['Output']

rsa = [line for line in result.splitlines()
            if line.startswith('ssh-rsa')][0]

instance = resource.Instance(sys.argv[1])
known_hosts = '{},{} {}\n'.format(instance.public_dns_name,
                                  instance.public_ip_address,
                                  rsa)

with open(os.path.expanduser('~/.ssh/known_hosts'), 'a') as fp:
    fp.write(known_hosts)

Let's go through this script section by section.

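import os
import sys
import boto3

We import the os and sys modules and the first-party AWS module boto3. The os module is needed at the end of the script, to expand the path to the known hosts file.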

client = boto3.client('ec2', region_name='us-west-2')
resource = boto3.resource('ec2', region_name='us-west-2')

It is often confusing what functionality is in client and what is in resource. The only rule I learned in a year of using the AWS API is to look in both places, and create both a client and a resource. In general, client maps directly to the low-level AWS REST API, while resource gives higher-level abstractions.

output = client.get_console_output(InstanceId=sys.argv[1])
result = output['Output']

This is the meat of the script -- we use the API to get the console output. These are the boot up messages from all services. When the SSH server starts up, it prints its key. All that is left now is to find it.
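The console output is not always available right after launch, so a script that runs immediately may find nothing to parse. The following is a minimal sketch of a polling loop. The helper name wait_for_console_output, the retry counts, and the assumption that the 'Output' key is missing or empty until the log has been captured are illustrative, not part of the original script.

```python
import time


def wait_for_console_output(client, instance_id, attempts=30, delay=10.0,
                            sleep=time.sleep):
    """Poll get_console_output until it returns non-empty text.

    `client` is assumed to be a boto3 EC2 client, or any object with a
    compatible get_console_output(InstanceId=...) method.
    """
    for attempt in range(attempts):
        response = client.get_console_output(InstanceId=instance_id)
        # Assumption: 'Output' is absent or empty until the console log
        # has been captured by AWS.
        output = response.get('Output', '')
        if output:
            return output
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(delay)
    raise RuntimeError(
        'no console output for {} after {} attempts'.format(
            instance_id, attempts))
```

In the script above, this could stand in for the direct get_console_output call: result = wait_for_console_output(client, sys.argv[1]).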

rsa = [line for line in result.splitlines()
            if line.startswith('ssh-rsa')][0]

This is a little hacky, but there is no nice way to do it. There are other possible heuristics. The nice thing is that if the heuristic fails, this will result in connection failure -- not an insecure connection!
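One of those other heuristics, sketched here under assumptions: images that use cloud-init usually print the host keys between BEGIN and END marker lines, sometimes with a prefix such as "ec2: " in front of each line. The helper name find_host_key is hypothetical, and the marker strings should be checked against the console output of the image actually in use. Like the one-liner above, it fails loudly when no key is found.

```python
# Assumed cloud-init marker lines; verify them against your image's output.
BEGIN_MARKER = '-----BEGIN SSH HOST KEY KEYS-----'
END_MARKER = '-----END SSH HOST KEY KEYS-----'


def find_host_key(console_output, key_type='ssh-rsa'):
    """Return 'key_type base64-key' from the marked block of the console log."""
    in_block = False
    for line in console_output.splitlines():
        if BEGIN_MARKER in line:
            in_block = True
        elif END_MARKER in line:
            in_block = False
        elif in_block:
            # Tolerate a per-line prefix such as 'ec2: ' before the key.
            start = line.find(key_type + ' ')
            if start == -1:
                continue
            fields = line[start:].split()
            if len(fields) >= 2:
                # Keep the key type and key data; drop the trailing comment.
                return ' '.join(fields[:2])
    raise ValueError('no {} host key found in console output'.format(key_type))
```

Raising an exception instead of returning an empty value keeps the property described above: a failed heuristic stops the script and never produces an entry that SSH would trust by mistake.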

instance = resource.Instance(sys.argv[1])
known_hosts = '{},{} {}\n'.format(instance.public_dns_name,
                                  instance.public_ip_address,
                                  rsa)

We grab the IP and name through the resource, and format them in the right way for SSH to understand.
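A known_hosts entry is a comma-separated list of host names and addresses, a space, and then the key. An instance without a public address may report an empty name or no IP at all, and formatting those values blindly would write a useless entry. The sketch below, with the hypothetical helper name known_hosts_line, skips empty values and refuses to build an entry with no hosts.

```python
def known_hosts_line(host_key, *names):
    """Format one known_hosts entry: 'name1,name2 keytype keydata'."""
    # Drop None and empty strings, e.g. a missing public IP address.
    hosts = [name for name in names if name]
    if not hosts:
        raise ValueError('instance has no public name or address')
    return '{} {}\n'.format(','.join(hosts), host_key)
```

With the names from the script, the call would be known_hosts_line(rsa, instance.public_dns_name, instance.public_ip_address).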

with open(os.path.expanduser('~/.ssh/known_hosts'), 'a') as fp:
    fp.write(known_hosts)

I chose to update known_hosts like this because originally this script was in a throw-away Docker image. In other cases, it might be wise to have a separate known hosts file for EC2 instances, or have an atomic update methodology.
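As a rough sketch of what an atomic update could look like: write the old contents plus the new entry to a temporary file in the same directory, then rename it over the original. The helper name add_known_host and the idea of a dedicated file are illustrative assumptions. This protects readers from seeing a half-written file, but it does not protect against two writers running at the same time.

```python
import os
import tempfile


def add_known_host(path, entry):
    """Append `entry` to the known hosts file at `path` via an atomic rename."""
    path = os.path.expanduser(path)
    directory = os.path.dirname(path) or '.'
    try:
        with open(path) as fp:
            existing = fp.read()
    except FileNotFoundError:
        existing = ''
    # Make sure the new entry starts on its own line.
    if existing and not existing.endswith('\n'):
        existing += '\n'
    # The temporary file must be on the same filesystem for the rename
    # to be atomic, so create it next to the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as fp:
            fp.write(existing + entry)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

In the script, the final two lines would then become a single call such as add_known_host('~/.ssh/ec2_known_hosts', known_hosts), where the file name is only an example.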

After running this code, it is possible to SSH without having to verify the host key. It is best to set the SSH options to fail if the host key is not there, for extra safety.
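One way to do that from Python is to build the ssh command line with the OpenSSH options StrictHostKeyChecking and UserKnownHostsFile set explicitly. This is a sketch; the helper name strict_ssh_command and the default path are assumptions made for illustration.

```python
import os


def strict_ssh_command(host, known_hosts_path='~/.ssh/known_hosts'):
    """Build an ssh argument list that refuses unknown or changed host keys."""
    return [
        'ssh',
        # Fail instead of prompting when the host key is not already known.
        '-o', 'StrictHostKeyChecking=yes',
        # Only trust keys from this file.
        '-o', 'UserKnownHostsFile=' + os.path.expanduser(known_hosts_path),
        host,
    ]
```

The resulting list can be passed to subprocess.call. With these options, a missing or mismatched key makes SSH exit with an error, with no prompt to accept the key.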

An alternative approach is to use the AWS API to set the SSH secret key. However, this is, in general, even less trivial to do securely.