SSH Start to Finish Architecture – Standing up the CA pt 2

Last week we covered most of the explanation and commands for standing up the SSH CA.  This week, we will cover the Key Revocation List (KRL) and how to inspect the generated certificates.  We’ll also include an asciinema demonstration of the process.  Let’s get started.

When you generate the certificate, one of the flags (“-V”) lets you create a window of validity for the certificate.  This is more than an expiration date, because it also includes the date that the certificate first becomes valid.  This is a great way to create certain user certificates.  You might only want to grant a user access for four hours over the weekend.  You can issue the certificate during the work week, but it will only work once the begin date/time is reached, and it will cease working when the end date/time is reached.
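
To make the validity window concrete, here is a minimal sketch of signing a key that only works for four hours on a Saturday, using the “-V” flag.  The file names (demo_user_ca, demo_user_key), the identity, the principal, the serial and the dates are all made up for illustration, and everything happens in a throwaway directory rather than on a real CA.

```shell
#!/bin/sh
# Sketch only: throwaway CA and user key in a temporary directory.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical CA key and user key, both without passphrases for the demo.
ssh-keygen -q -t ed25519 -N '' -f demo_user_ca -C 'demo user CA'
ssh-keygen -q -t ed25519 -N '' -f demo_user_key -C 'demo user key'

# Sign the user's public key.  -V takes start:end as YYYYMMDDHHMMSS,
# so this certificate is valid from 09:00 to 13:00 on 5 January 2030.
ssh-keygen -q -s demo_user_ca -I weekend-access -n demo_user -z 1001 \
    -V 20300105090000:20300105130000 demo_user_key.pub

# Show the validity line of the new certificate (demo_user_key-cert.pub).
ssh-keygen -L -f demo_user_key-cert.pub | grep 'Valid:'
```

The times are read in the local time zone of the machine doing the signing, so the CA clock matters.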

Now let’s assume that you issue certificates for a duration of one year at a time for regular employees.  An employee is issued a certificate that becomes valid as of the first of February of the year, which means it would need to be re-issued by the next February.  If the employee changes job roles and no longer needs access to the same sets of servers, that certificate is now a problem.  They have access to systems they should not, and there is a need to revoke that access.  In order to do that, we are supposed to use a Key Revocation List.

The KRL is created by the USER CA in our example, and must be redistributed to every host each time it is updated.  This distribution is much like managing authorized_keys files, which means it can be cumbersome.  It is at least management of just one file, but that’s still one file more than ideal.

From the sshd_config man page:

RevokedKeys
Specifies revoked public keys file, or none to not use one. Keys listed in this file will be refused for public key authentication. Note that if this file is not readable, then public key authentication will be refused for all users. Keys may be specified as a text file, listing one public key per line, or as an OpenSSH Key Revocation List (KRL) as generated by ssh-keygen(1). For more information on KRLs, see the KEY REVOCATION LISTS section in ssh-keygen(1).

Okay.  Now to check the ssh-keygen man page:

ssh-keygen is able to manage OpenSSH format Key Revocation Lists (KRLs). These binary files specify keys or certificates to be revoked using a compact format, taking as little as one bit per certificate if they are being revoked by serial number.

serial: serial_number[-serial_number]
Revokes a certificate with the specified serial number. Serial numbers are 64-bit values, not including zero and may be expressed in decimal, hex or octal. If two serial numbers are specified separated by a hyphen, then the range of serial numbers including and between each is revoked. The CA key must have been specified on the ssh-keygen command line using the -s option.

Remember last week when we said a serial number was needed for the KRL?  This is why.  The user generates their keys.  They send you their public key for signing.  You sign the key to generate the certificate, and ship them the certificate.  The correct thing to do at this point is to delete your copy of the public key AND the issued certificate.

Now suppose that person’s workstation is breached, and the key should no longer be trusted.  You can’t just let the certificate expire naturally.  This person had a highly privileged role, and you want to be SURE the authentication is fully revoked, but only for this one certificate that was issued.  You need to issue a KRL statement, but you don’t have a copy of the actual certificate or key to revoke.  In this case you need to revoke by serial number.

The serial number is supposed to be unique, so it is a good idea to create the serial number from a scheme.  You might consider something like “a UID number + some base range.”  If the user’s UID is 2352, and you settle on a base range of four digits, for example, the first serial would be 23520001.  This would get incremented each time a certificate is issued for that user.  Alternatively, log the serial number into a database for each certificate issued, so that it can be searched for quickly.  Whatever works best.
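
As a small worked example of the “UID + base range” idea, the sketch below glues a UID to a zero-padded four-digit counter.  The helper name make_serial is hypothetical, and where the per-user counter comes from (a file, a database) is left out on purpose.

```shell
#!/bin/sh
# make_serial UID COUNT
# Hypothetical helper: the UID followed by a zero-padded four-digit counter.
make_serial() {
    printf '%d%04d\n' "$1" "$2"
}

make_serial 2352 1    # prints 23520001, the first certificate for UID 2352
make_serial 2352 2    # prints 23520002, the next one for the same user
```

With four digits the scheme only stays unambiguous for the first 9999 certificates per user, so pick the width of the base range with that in mind.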

When it comes time to revoke all certificates a user may own (in the case of multiple valid certificates) you can also revoke by ID.

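Remember, when generating the certificate, the “-z” flag is for the serial number, and the “-I” flag is for the identity.  When revoking a certificate, you use the “-k” flag, but the serial number or identity does not go on the command line.  It goes into a KRL specification file as a “serial: <serial_number>” or “id: <certificate_identity>” line, and “-s” points at the CA public key (with “-k”, the “-z” flag sets the KRL version number instead).  The command looks like this:

ssh-keygen -k -u -f <krl_file> -s <ca_public_key> <krl_specification_file>

The reason we also provided the “-u” flag is that it updates the KRL rather than replacing it.  This means we don’t accidentally remove other revocations that should still be present.  The KRL has to exist already for “-u” to work, so leave the flag off the very first time.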
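
As a sketch of how revoking by serial number and then by identity can look end to end, the script below writes the “serial:” and “id:” lines into small specification files and hands them to “ssh-keygen -k”.  The file names (demo_user_ca, demo_revoked_keys, the .spec files), the serial 1001 and the identity demo_user are assumptions for the demonstration, and it all runs in a throwaway directory.

```shell
#!/bin/sh
# Sketch only: builds a throwaway CA and KRL in a temporary directory.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical user CA; the KRL only needs its public half.
ssh-keygen -q -t ed25519 -N '' -f demo_user_ca -C 'demo user CA'

# First revocation: one certificate, by serial number.
# There is no KRL yet, so this run creates it (no -u).
printf 'serial: 1001\n' > revoke_serial.spec
ssh-keygen -q -k -f demo_revoked_keys -s demo_user_ca.pub revoke_serial.spec

# Second revocation: every certificate issued with the identity "demo_user".
# -u adds to the existing KRL instead of replacing it.
printf 'id: demo_user\n' > revoke_id.spec
ssh-keygen -q -k -u -f demo_revoked_keys -s demo_user_ca.pub revoke_id.spec

# The result is a small binary file, ready to be distributed.
ls -l demo_revoked_keys
```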

What are some of the problems with this solution?

If we tell the server to use a KRL, the file must exist and be readable.  As the sshd_config excerpt above says, if it is not, public key authentication is refused for all users.  This means an empty file must be there if sshd is configured to point to a KRL file but there are no keys to revoke yet.  If a systems administrator unfamiliar with this removes the file because it is empty, in an attempt to clean up zero-length files on the system, key and certificate logins to that server stop working.

The KRL needs to be managed on a per-endpoint basis.  This is much like the problem of handling individual authorized_keys files per server.  The reason I specifically mentioned the serial number issue is to pinpoint a scenario where we are not revoking access for a user because they are no longer here, but revoking ONE certificate for that user because of a breach.

There are several ways to deal with the KRL situation.  You could create a script that pulls the KRL from a single site and stick it in a cron job.  You could use an rsync process to push it out to multiple end points every time the file is updated.  Neither of these is ideal, but I do NOT recommend doing something that seems easy and might cause nightmares in the wee hours of the morning one weekend for some unlucky on-call engineer.  Do NOT point the configuration to a network enabled file share.  If the share were to fall out, the file would no longer be there in the eyes of sshd, and public key authentication would be refused for everyone until somebody noticed.  You might consider having a network share, but use a script that regularly checks for the file to be updated before copying it into place locally.  Whatever you choose, the solution isn’t going to be pretty.
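
The “check, then copy into place locally” idea could be sketched roughly like this.  The function name install_krl and the paths in the usage comment are hypothetical, and a real version would want logging and alerting around it.  The point is only that a missing source never removes the local copy, and that the local file is swapped in with a rename rather than written in place.

```shell
#!/bin/sh
# install_krl SOURCE DEST
# Hypothetical helper: copy an updated KRL into place without ever
# leaving DEST missing or half written.
install_krl() {
    src=$1
    dest=$2

    # Source unavailable (share dropped, pull failed): keep the local copy.
    if [ ! -r "$src" ]; then
        echo "source KRL not readable, keeping $dest" >&2
        return 1
    fi

    # Nothing to do when the local copy is already identical.
    if [ -e "$dest" ] && cmp -s "$src" "$dest"; then
        return 0
    fi

    # Copy next to the destination, then rename over it in one step.
    tmp=$(mktemp "$dest.XXXXXX") || return 1
    if cp "$src" "$tmp" && chmod 644 "$tmp" && mv "$tmp" "$dest"; then
        return 0
    fi
    rm -f "$tmp"
    return 1
}

# Example call from cron (paths are assumptions):
# install_krl /mnt/ca_share/revoked_keys /etc/ssh/revoked_keys
```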

One more note about KRLs.  You can test if a certificate or key is in a revocation list with the “-Q” flag to ssh-keygen.

ssh-keygen -Q -f <KRL_file> <key_or_certificate_file>
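
Here is a minimal sketch of that check against two certificates, one revoked and one not.  It relies on the exit status of “ssh-keygen -Q”, which is non-zero when a listed key is revoked or the check hits an error.  The names (demo_alice, demo_bob, demo_revoked_keys), the serials and the check_cert helper are invented for the example.

```shell
#!/bin/sh
# Sketch only: throwaway CA, two certificates, and a KRL revoking one.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

ssh-keygen -q -t ed25519 -N '' -f demo_user_ca -C 'demo user CA'
ssh-keygen -q -t ed25519 -N '' -f demo_alice -C 'alice'
ssh-keygen -q -t ed25519 -N '' -f demo_bob -C 'bob'
ssh-keygen -q -s demo_user_ca -I alice -n alice -z 1001 demo_alice.pub
ssh-keygen -q -s demo_user_ca -I bob -n bob -z 1002 demo_bob.pub

# Revoke only serial 1001 (alice's certificate).
printf 'serial: 1001\n' > revoke.spec
ssh-keygen -q -k -f demo_revoked_keys -s demo_user_ca.pub revoke.spec

# check_cert FILE: hypothetical helper that reports on the -Q exit status.
check_cert() {
    if ssh-keygen -Q -f demo_revoked_keys "$1" >/dev/null; then
        echo "$1: not revoked"
    else
        echo "$1: revoked (or the check failed)"
    fi
}

check_cert demo_alice-cert.pub
check_cert demo_bob-cert.pub
```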

And this is a good time to transition to inspecting certificates.  In order to inspect a certificate, use the ssh-keygen “-L” flag.

ssh-keygen -L -f <key_or_certificate_file>

This is what the output looks like for an example certificate:

$ ssh-keygen -Lf ./.ssh/id_rsa-cert.pub
./.ssh/id_rsa-cert.pub:
        Type: ssh-rsa-cert-v01@openssh.com user certificate
        Public key: RSA-CERT 04:29:a8:fd:55:04:db:8f:1e:0d:45:18:a7:8e:a7:a6
        Signing CA: RSA 27:cc:19:a3:67:1b:5e:2e:6a:48:a9:25:25:6d:64:6c
        Key ID: "root"
        Serial: 1234
        Valid: forever
        Principals:
                root
        Critical Options: (none)
        Extensions:
                permit-X11-forwarding
                permit-agent-forwarding
                permit-port-forwarding
                permit-pty
                permit-user-rc
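
Since the serial and the Key ID are exactly what a revocation needs, it can be handy to pull them out of that output in a script.  The sketch below does that with awk and sed against a throwaway certificate.  The file names are made up, and the parsing assumes the “Serial:” and “Key ID:” lines look the way they do in the output above.

```shell
#!/bin/sh
# Sketch only: sign a throwaway key, then read the serial and Key ID back.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

ssh-keygen -q -t ed25519 -N '' -f demo_user_ca -C 'demo user CA'
ssh-keygen -q -t ed25519 -N '' -f demo_key -C 'demo key'
ssh-keygen -q -s demo_user_ca -I root -n root -z 1234 demo_key.pub

details=$(ssh-keygen -L -f demo_key-cert.pub)

# The serial is the second field on the "Serial:" line.
serial=$(printf '%s\n' "$details" | awk '$1 == "Serial:" { print $2 }')
# The Key ID is the quoted string on the "Key ID:" line.
key_id=$(printf '%s\n' "$details" | sed -n 's/^ *Key ID: "\(.*\)"$/\1/p')

echo "serial=$serial id=$key_id"
```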

There was going to be an Asciinema recording.  I went through the whole sequence until it was time to log into the target server.  The certificate I generated was dated 2014 because the date/time on the BeagleBone is off by that much, and now it’s past midnight.  I will re-do this work and upload it at a later date after correcting the clock issue on the demonstration CA machine.

I apologize for the lack of a recording at this time.  Thanks for reading!