This documentation is intended for internal use by the RE team.

GDS AWS Account Management Service

Runbook

Creating new AWS accounts

Reliability Engineering are responsible for creating and setting up new AWS accounts for the rest of CO. These should be joined to the CO organisation so the account appears on the consolidated billing. The root credentials should also be secured in a standard way.

Requests for new AWS accounts should be made via the gds request an aws account app. When someone submits a request the app generates new terraform config, which needs to be applied.

The steps are:

Basic account creation

Read the “New AWS account request” email.

The ‘gds request an aws account app’ sends an email to the gds-aws-account-management@digital.cabinet-office.gov.uk address. There are basic instructions in the email to get you started.

Review and merge the pull request

The app generates terraform config in JSON format and opens a pull request with the changes against the aws-billing-account git repo. You need to review the PR, and if it looks sensible, approve and merge it.

You will notice that the terraform always includes a "role_name": "bootstrap", line. AWS Organizations creates this bootstrap role in the new account; we use it during setup and delete it later. (Deleting this role does not cause any problems with terraform or AWS Organizations.)

Run the terraform

You will need to be listed as an org admin (billing-org profile) in account_terraform/org_admin_users.tf to perform this (technically you could be a full billing account admin, but please don’t use that for this if you have it).

The aws-billing-account terraform is applied using the gds-cli billing-org profile.

Once in aws-billing-account.git’s terraform directory, run gds aws billing-org bash and then inside the bash session use terraform init and terraform apply.

This will show a plan. Review the output and if you’re happy, continue with the apply.

This is the step where you might run into issues with any punctuation in the tags provided by the requestor. Feel free to edit the tags on your machine and commit and push the fix for review after creation is dealt with.
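If you want to spot problem tags before applying, a check along the following lines may help. It is a minimal sketch: tag_value_ok is a hypothetical helper, and the character set is the one generally accepted across AWS services (letters, numbers, spaces and + - = . _ : / @), assumed here to be what the apply trips over. It only allows ASCII letters, so it is stricter than AWS itself.

```shell
# Hypothetical helper, not part of the request app or the terraform repo.
# Succeeds if the value contains only characters generally accepted in
# AWS tags: ASCII letters, digits, spaces and + - = . _ : / @
tag_value_ok() {
  if printf '%s' "$1" | LC_ALL=C grep -q '[^A-Za-z0-9 +=._:/@-]'; then
    return 1
  fi
  return 0
}
```

For example, tag_value_ok "Reliability Engineering" succeeds, while a value containing an apostrophe, comma or exclamation mark fails.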

The terraform will have created the new account and joined it to the organisation. However, it is not able to complete any other steps required.

Note the ID that has been assigned to the new account, which terraform shows in brackets. This is the AWS account ID, which you will need to insert into gds-cli and possibly provide to the users. (If you miss this ID and clear your terminal, you can dig it back out again through the AWS Organizations API or console.)

Keep a note of the ID and exit the bash session.
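The two helpers below sketch how you might sanity-check the ID you noted, or recover it if you lost it. Both function names are hypothetical. The lookup assumes the aws CLI is available inside the billing-org bash session and that you know the account name used in the request. AWS account IDs are always 12 digits and can start with a zero, so treat them as strings.

```shell
# Hypothetical helpers, not part of gds-cli or the aws-billing-account repo.

# Succeeds only if the argument is exactly 12 digits.
is_account_id() {
  case "$1" in
    "" | *[!0-9]*) return 1 ;;
  esac
  [ "${#1}" -eq 12 ]
}

# Prints the ID of the account with the given name.
# Run this inside the `gds aws billing-org bash` session.
find_account_id() {
  aws organizations list-accounts \
    --query "Accounts[?Name=='$1'].Id" --output text
}
```

For example, find_account_id re-example should print the ID, which you can then pass to is_account_id before pasting it anywhere.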

Configuring IAM access to the account

Edit the trust relationships on the bootstrap role

We have a small script that is used to assume the bootstrap role and alter it to trust the bootstrapping user. The script still lives in the old re-infra-release-automation repository and should be given the account ID number you found in the terraform output above.

It assumes you have an aws-vault profile named gds-users, which most gds-cli users probably do.

gds-cli bootstrap namedRole

Since we’re all gds-cli users now, this process will assume you want to bootstrap the account using gds-cli, and possibly hand it over to users for them to just use through gds-cli conveniently.

Open gds-cli.git and go to pkg/gds_aws/aws-roles.yaml. You will need to add a new entry to one of the accounts lists in a group (you may also invent a new group if it doesn’t fit into the existing ones). Each account needs the AWS account ID (which you will have from the terraform output above) and a list of roles.

Here’s an example group with a new account:

- name: Reliability Engineering
  accounts:
  - id: '012345678901'
    roles:
    - type: namedRole
      name: re-example
      desc: admin access to re-example
      role_name: bootstrap

Run go build and we’ll use our newly built gds-cli tool in the next step.
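Before building, it can be worth confirming that the new ID went into the file quoted, as in the example above, since an unquoted ID with a leading zero risks being read as a number. A minimal sketch follows; roles_yaml_has_account is a hypothetical helper and assumes the single-quoted style shown in the example.

```shell
# Hypothetical check: succeeds if the file ($2) contains an entry for the
# single-quoted account ID ($1), in the style used in aws-roles.yaml.
roles_yaml_has_account() {
  grep -q "id: '$1'" "$2"
}
```

Usage might look like roles_yaml_has_account 012345678901 pkg/gds_aws/aws-roles.yaml && go build.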

Account terraform

Rather than handing the bootstrap role to the requesting users and telling them to get on with it, we set up a role that gives Cyber a certain amount of access into the account, and create an admin role for each requested initial admin user. These roles are compatible with gds-cli’s adminRole naming convention.

We do this through some terraform that we store in tech-ops-private.

  • Decide whether this will be kept in reliability-engineering, cyber-security, or cabinet-office.
  • Run git log on the directory to find the latest created account there.
  • You should see files like reliability-engineering/terraform/deployments/re-example/account/site.tf and possibly reliability-engineering/terraform/deployments/re-example/account/.terraform.lock.hcl
  • Copy these files into an equivalent new directory. Don’t keep .terraform directories, terraform.tfstate files, terraform.tfstate.backup files, etc.
  • Modify site.tf:
    • Comment out the terraform s3 backend section at the top
    • Change the allowed account ID to the ID of the account you’re bootstrapping
    • Change the state bucket name
    • Change the admin user modules such that there is one for every requested initial admin user, and yourself (you will need it to test everything is working correctly, and the requesting users may come back to you for help and with questions - they may remove the role later themselves)
    • At time of writing, restrict_to_gds_ips should be fine for GDS and CO users. It might need to be disabled for contractors.
    • If you intend to change iam_policy_arns to something other than full admin, change role_suffix, and consider providing a userRole (with a format field of e.g. %s-newsuffixhere) instead of adminRole below.
    • It is expected that all accounts have at least two full admins other than the bootstrapping user.
    • Ensure you keep the gds_security_audit_role module at the bottom.
  • Use ~/gds-cli/gds-cli aws re-example bash and run terraform init and then terraform apply.
    • This is the step where you might run into issues around users not existing - just go get them created the usual way.
  • Return to site.tf and uncomment the s3 backend section. Re-run terraform init and let it upload the statefile (which so far is only on your local machine) to S3.
  • Exit your bash session.
  • Commit your site.tf (and possibly .terraform.lock.hcl) file. Again, don’t bother with the other files (e.g. statefiles) terraform produces.
  • Remember to push and get someone to review.
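The copy step above can be sketched as a small helper that takes only the files worth keeping, so that .terraform directories and statefiles never reach the new directory. copy_account_skeleton is a hypothetical name, and the paths in the usage note are illustrative.

```shell
# Hypothetical helper: copy only the files worth keeping from an existing
# account deployment ($1) into a new one ($2).
copy_account_skeleton() {
  src="$1"
  dst="$2"
  mkdir -p "$dst" || return 1
  cp "$src/site.tf" "$dst/" || return 1
  # The lock file is optional, so only copy it if it exists.
  if [ -f "$src/.terraform.lock.hcl" ]; then
    cp "$src/.terraform.lock.hcl" "$dst/"
  fi
}
```

For example, from the reliability-engineering/terraform/deployments directory: copy_account_skeleton re-example/account re-newaccount/account, where re-newaccount stands in for the new account's name.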

gds-cli adminRole

Now it’s time to configure gds-cli the way we actually tend to use it, in adminRole mode. As programmes became large and we began to hit the size limits on assume role trust policies, we moved away from a single role trusting everyone (known in gds-cli as namedRole) to per-user roles (known in gds-cli as adminRole). Remove the role_name: bootstrap field and change type to adminRole. You should now have something more like this:

- name: Reliability Engineering
  accounts:
  - id: '012345678901'
    roles:
    - type: adminRole
      name: re-example
      desc: admin access to re-example

Run go build and test it out using ~/gds-cli/gds-cli aws re-example aws sts get-caller-identity. You should get a result with UserId and Arn fields, and an Account field matching that of the new account you’ve created.
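To compare the Account field without reading the JSON by eye, something like the following could be used. account_from_identity is a hypothetical helper that reads the get-caller-identity JSON on standard input. It assumes the default JSON output format and uses grep rather than a JSON parser, so treat it as a convenience only.

```shell
# Hypothetical helper: read `aws sts get-caller-identity` JSON on stdin
# and print the 12-digit value of its Account field.
account_from_identity() {
  grep -o '"Account": *"[0-9]*"' | grep -o '[0-9]\{12\}'
}
```

Usage might look like ~/gds-cli/gds-cli aws re-example aws sts get-caller-identity | account_from_identity, with the output compared against the ID you noted from the terraform output.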

Remember to commit, push, and get someone to review.

Remove bootstrap role

Now you’re done, you can ~/gds-cli/gds-cli aws re-example -l to log into the console, go to IAM, locate the bootstrap role and delete it. (You can also do this via the CLI if you wish.)
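If you prefer the CLI route, the sketch below shows one way it might look. IAM will not delete a role that still has policies, so the helper detaches managed policies and deletes inline ones first. delete_role_with_policies is a hypothetical name, and the function assumes you run it inside a session for the new account (for example one started with ~/gds-cli/gds-cli aws re-example bash) using your adminRole rather than the bootstrap role itself.

```shell
# Hypothetical helper: remove a role's policies, then the role itself.
delete_role_with_policies() {
  role="$1"
  # Detach managed policies (printed as whitespace-separated ARNs).
  for arn in $(aws iam list-attached-role-policies --role-name "$role" \
      --query 'AttachedPolicies[].PolicyArn' --output text); do
    aws iam detach-role-policy --role-name "$role" --policy-arn "$arn"
  done
  # Delete inline policies by name.
  for name in $(aws iam list-role-policies --role-name "$role" \
      --query 'PolicyNames[]' --output text); do
    aws iam delete-role-policy --role-name "$role" --policy-name "$name"
  done
  aws iam delete-role --role-name "$role"
}
```

Calling delete_role_with_policies bootstrap would then remove the role in one go.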

Configuring root access to the account

Trigger a password reset for the root user

As the following process involves logging in as the root user for this account, Cyber should first be notified of this activity, preferably being told the account id in question.

In a browser open the AWS console login page. Use the root user email address used when the account was created. This will be in the terraform output. Select the “forgot my password” link.

Generate a long, random password:

$ pwgen -sy 64

You will have received a password reset email to the aws-root-accounts@ email address. Click the reset link in the email and use the password generated above to reset the password and log into the account.

You do not need to store the new password anywhere; it should be forgotten. If we need to access the account as the root user, we can go through the password reset process using the MFA token, set up in the next step.

Set up MFA for the root user

The MFA tokens for the root users of all new accounts are ultimately stored on one of a set of pairs of Yubikeys, kept in the safe.

This documents how the process is supposed to work outside of COVID restrictions. During COVID restrictions, we use an engineer’s officially issued yubikeys and a private screen share session with a second (cleared) engineer, until one of the two people is able to visit the safe, log in, remove the MFA device and repeat this process using one of the root key pairs. This second person does not require safe access or root email access.

In the AWS console, click on the account name in the top right, then select My Security Credentials. Then select MFA, then Activate MFA, then Virtual MFA device. This will display a QR code.

Click the “Show secret key for manual configuration” link, which will display the key as a string that you can seed the yubikeys with.

Insert one of the two Yubikeys into your laptop. You can view the existing tokens with:

$ ykman oath accounts list
Amazon Web Services:root-mfa@123456720024 (foo-bar-bob)
...

The yubikey can hold up to 32 MFA tokens.
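To see how much room is left before adding a token, you could count the existing entries. This is a minimal sketch: oath_slots_left is a hypothetical helper, it assumes the listing prints one account per line, and it takes the limit of 32 from the note above.

```shell
# Hypothetical helper: read the `ykman oath accounts list` output on stdin
# and print how many of the 32 slots are still free.
oath_slots_left() {
  used=$(wc -l | tr -d ' ')
  echo $((32 - used))
}
```

Usage might look like ykman oath accounts list | oath_slots_left. If it prints 0, pick a different pair of Yubikeys.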

Add the new MFA token:

$ ykman oath accounts add "Amazon Web Services:root-mfa@new account ID" '[MFA token in string form]'

The AWS console will request two consecutive codes. A new code is generated every 30 seconds, based on the clock of whichever device you have inserted the Yubikey into. The following command will show the current code:

$ ykman oath accounts code 'Amazon Web Services:root-mfa@new account ID'

Remember to add the MFA token to the second yubikey by running the ykman oath accounts add command from above.

Once you have inserted the two consecutive TOTP codes (one from each yubikey please) AWS will check they are correct before saving.
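Two small helpers can make this step less error-prone. Both are hypothetical and rely only on the label format and the 30-second code window described above. mfa_label builds the OATH account name so that both yubikeys get an identical entry, and seconds_until_next_code shows how long to wait before the second code becomes available.

```shell
# Hypothetical helpers for the MFA step.

# Build the OATH account label for a given AWS account ID.
mfa_label() {
  printf 'Amazon Web Services:root-mfa@%s\n' "$1"
}

# Print the seconds remaining in the current 30-second code window.
# Takes an optional Unix timestamp; defaults to the current time.
seconds_until_next_code() {
  now="${1:-$(date +%s)}"
  echo $((30 - now % 30))
}
```

For example, ykman oath accounts code "$(mfa_label 012345678901)" shows the current code for that account, and sleep "$(seconds_until_next_code)" waits until the next one is due.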

Test you can access the root user using the new credentials

In a private browsing window in your browser open the console and test you can log into the new account as the root user using the password you generated earlier and an MFA token from the yubikeys.

Put the yubikeys back in the safe!!!

The work is not complete until both yubikeys are secured back in the safe.

Inform the requestor the account is ready to use

Tell the requestor the account is ready to use and that they can assume the admin role you have created from their gds-users user to gain admin access. They may request a gds-cli release for convenience, if you can arrange one.