Dave Hall Consulting logo

aws

If You’re not Using YAML for CloudFormation Templates, You’re Doing it Wrong

In my last blog post, I promised a rant about using YAML for CloudFormation templates. Here it is. If you persevere to the end I’ll also show you have to convert your existing JSON based templates to YAML.

Many of the points I raise below don’t just apply to CloudFormation. They are general comments about why you should use YAML over JSON for configuration when you have a choice.

One criticism of YAML is its reliance on indentation. A lot of the code I write these days is Python, so indentation being significant is normal. Use a decent editor or IDE and this isn’t a problem. It doesn’t matter if you’re using JSON or YAML, you will want to validate and lint your files anyway. How else will you find that trailing comma in your JSON object?

Now we’ve got that out of the way, let me try to convince you to use YAML.

As developers we are regularly told that we need to document our code. CloudFormation is Infrastructure as Code. If it is code, then we need to document it. That starts with the Description property at the top of the file. If you JSON for your templates, that’s it, you have no other opportunity to document your templates. On the other hand, if you use YAML you can add inline comments. Anywhere you need a comment, drop in a hash # and your comment. Your team mates will thank you.

JSON templates don’t support multiline strings. These days many developers have 4K or ultra wide monitors, we don’t want a string that spans the full width of our 34” screen. Text becomes harder to read once you exceed that “90ish” character limit. With JSON your multiline string becomes "[90ish-characters]\n[another-90ish-characters]\n[and-so-on"]. If you opt for YAML, you can use the greater than symbol (>) and then start your multiline comment like so:

Description: >
  This is the first line of my Description
  and it continues on my second line
  and I'll finish it on my third line.

As you can see it much easier to work with multiline string in YAML than JSON.

“Folded blocks” like the one above are created using the > replace new lines with spaces. This allows you to format your text in a more readable format, but allow a machine to use it as intended. If you want to preserve the new line, use the pipe (|) to create a “literal block”. This is great for an inline Lambda functions where the code remains readable and maintainable.

  APIFunction:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        ZipFile: |
          import json
          import random


          def lambda_handler(event, context):
              return {"statusCode": 200, "body": json.dumps({"value": random.random()})}
      FunctionName: "GetRandom"
      Handler: "index.lambda_handler"
      MemorySize: 128
      Role: !GetAtt LambdaServiceRole.Arn
      Runtime: "python3.7"
		Timeout: 5

Both JSON and YAML require you to escape multibyte characters. That’s less of an issue with CloudFormation templates as generally you’re only using the ASCII character set.

In a YAML file generally you don’t need to quote your strings, but in JSON double quotes are used every where, keys, string values and so on. If your string contains a quote you need to escape it. The same goes for tabs, new lines, backslashes and and so on. JSON based CloudFormation templates can be hard to read because of all the escaping. It also makes it harder to handcraft your JSON when your code is a long escaped string on a single line.

Some configuration in CloudFormation can only be expressed as JSON. Step Functions and some of the AppSync objects in CloudFormation only allow inline JSON configuration. You can still use a YAML template and it is easier if you do when working with these objects.

The JSON only configuration needs to be inlined in your template. If you’re using JSON you have to supply this as an escaped string, rather than nested objects. If you’re using YAML you can inline it as a literal block. Both YAML and JSON templates support functions such as Sub being applied to these strings, it is so much more readable with YAML. See this Step Function example lifted from the AWS documentation:

MyStateMachine:
  Type: "AWS::StepFunctions::StateMachine"
  Properties:
    DefinitionString:
      !Sub |
        {
          "Comment": "A simple AWS Step Functions state machine that automates a call center support session.",
          "StartAt": "Open Case",
          "States": {
            "Open Case": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:open_case",
              "Next": "Assign Case"
            }, 
            "Assign Case": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:assign_case",
              "Next": "Work on Case"
            },
            "Work on Case": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:work_on_case",
              "Next": "Is Case Resolved"
            },
            "Is Case Resolved": {
                "Type" : "Choice",
                "Choices": [ 
                  {
                    "Variable": "$.Status",
                    "NumericEquals": 1,
                    "Next": "Close Case"
                  },
                  {
                    "Variable": "$.Status",
                    "NumericEquals": 0,
                    "Next": "Escalate Case"
                  }
              ]
            },
             "Close Case": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:close_case",
              "End": true
            },
            "Escalate Case": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:escalate_case",
              "Next": "Fail"
            },
            "Fail": {
              "Type": "Fail",
              "Cause": "Engage Tier 2 Support."    }   
          }
        }

If you’re feeling lazy you can use inline JSON for IAM policies that you’ve copied from elsewhere. It’s quicker than converting them to YAML.

YAML templates are smaller and more compact than the same configuration stored in a JSON based template. Smaller yet more readable is winning all round in my book.

If you’re still not convinced that you should use YAML for your CloudFormation templates, go read Amazon’s blog post from 2017 advocating the use of YAML based templates.

Amazon makes it easy to convert your existing templates from JSON to YAML. cfn-flip is aPython based AWS Labs tool for converting CloudFormation templates between JSON and YAML. I will assume you’ve already installed cfn-flip. Once you’ve done that, converting your templates with some automated cleanups is just a command away:

cfn-flip --clean template.json template.yaml

git rm the old json file, git add the new one and git commit and git push your changes. Now you’re all set for your new life using YAML based CloudFormation templates.

If you want to learn more about YAML files in general, I recommend you check our Learn X in Y Minutes’ Guide to YAML. If you want to learn more about YAML based CloudFormation templates, check Amazon’s Guide to CloudFormation Templates.

Logging Step Functions to CloudWatch

Many AWS Services log to CloudWatch. Some do it out of the box, others need to be configured to log properly. When Amazon released Step Functions, they didn’t include support for logging to CloudWatch. In February 2020, Amazon announced StepFunctions could now log to CloudWatch. Step Functions still support CloudTrail logs, but CloudWatch logging is more useful for many teams.

Users need to configure Step Functions to log to CloudWatch. This is done on a per State Machine basis. Of course you could click around he console to enable it, but that doesn’t scale. If you use CloudFormation to manage your Step Functions, it is only a few extra lines of configuration to add the logging support.

In my example I will assume you are using YAML for your CloudFormation templates. I’ll save my “if you’re using JSON for CloudFormation you’re doing it wrong” rant for another day. This is a cut down example from one of my services:

---
AWSTemplateFormatVersion: '2010-09-09'
Description: StepFunction with Logging Example.
Parameters:
Resources:
  StepFunctionExecRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service: !Sub "states.${AWS::Region}.amazonaws.com"
          Action:
          - sts:AssumeRole
      Path: "/"
      Policies:
      - PolicyName: StepFunctionExecRole
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Effect: Allow
            Action:
            - lambda:InvokeFunction
            - lambda:ListFunctions
            Resource: !Sub "arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:my-lambdas-namespace-*"
          - Effect: Allow
            Action:
            - logs:CreateLogDelivery
            - logs:GetLogDelivery
            - logs:UpdateLogDelivery
            - logs:DeleteLogDelivery
            - logs:ListLogDeliveries
            - logs:PutResourcePolicy
            - logs:DescribeResourcePolicies
            - logs:DescribeLogGroups
            Resource: "*"
  MyStateMachineLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: /aws/stepfunction/my-step-function
      RetentionInDays: 14
  DashboardImportStateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      StateMachineName: my-step-function
      StateMachineType: STANDARD
      LoggingConfiguration:
        Destinations:
          - CloudWatchLogsLogGroup:
             LogGroupArn: !GetAtt MyStateMachineLogGroup.Arn
        IncludeExecutionData: True
        Level: ALL
      DefinitionString:
        !Sub |
        {
          ... JSON Step Function definition goes here
        }
      RoleArn: !GetAtt StepFunctionExecRole.Arn

The key pieces in this example are the second statement in the IAM Role with all the logging permissions, the LogGroup defined by MyStateMachineLogGroup and the LoggingConfiguration section of the Step Function definition.

The IAM role permissions are copied from the example policy in the AWS documentation for using CloudWatch Logging with Step Functions. The CloudWatch IAM permissions model is pretty weak, so we need to grant these broad permissions.

The LogGroup definition creates the log group in CloudWatch. You can use what ever value you want for the LogGroupName. I followed the Amazon convention of prefixing everything with /aws/[service-name]/ and then appended the Step Function name. I recommend using the RetentionInDays configuration. It stops old logs sticking around for ever. In my case I send all my logs to ELK, so I don’t need to retain them in CloudWatch long term.

Finally we use the LoggingConfiguration to tell AWS where we want to send out logs. You can only specify a single Destinations. The IncludeExecutionData determines if the inputs and outputs of each function call is logged. You should not enable this if you are passing sensitive information between your steps. The verbosity of logging is controlled by Level. Amazon has a page on Step Function log levels. For dev you probably want to use ALL to help with debugging but in production you probably only need ERROR level logging.

I removed the Parameters and Output from the template. Use them as you need to.

AWS Parameter Store

Anyone with a moderate level of AWS experience will have learned that Amazon offers more than one way of doing something. Storing secrets is no exception. 

It is possible to spin up Hashicorp Vault on AWS using an official Amazon quick start guide. The down side of this approach is that you have to maintain it.

If you want an "AWS native" approach, you have 2 services to choose from. As the name suggests, Secrets Manager provides some secrets management tools on top of the store. This includes automagic rotation of AWS RDS credentials on a regular schedule. For the first 30 days the service is free, then you start paying per secret per month, plus API calls.

There is a free option, Amazon's Systems Manager Parameter Store. This is what I'll be covering today.

Structure

It is easy when you first start out to store all your secrets at the top level. After a while you will regret this decision. 

Parameter Store supports hierarchies. I recommend using them from day one. Today I generally use /[appname]-[env]/[KEY]. After some time with this scheme I am finding that /[appname]/[env]/[KEY] feels like it will be easier to manage. IAM permissions support paths and wildcards, so either scheme will work.

If you need to migrate your secrets, use Parameter Store namespace migration script

Access Controls

Like most Amazon services IAM controls access to Parameter Store. 

Parameter Store allows you to store your values as plain text or encrypted using a key using KMS. For encrypted values the user must have have grants on the parameter store value and KMS key. For consistency I recommend encrypting all your parameters.

If you have a monolith a key per application per envionment is likely to work well. If you have a collection of microservices having a key per service per environment becomes difficult to manage. In this case share a key between several services in the same environment.

Here is an IAM policy for an Lambda function to access a hierarchy of values in parameter store:

To allow your developers to manage the parameters in dev you will need a policy that looks like this:

Amazon has great documentation on controlling access to Parameter Store and KMS.

Adding Parameters

Amazon allows you to store almost any string up to 4Kbs in length in the Parameter store. This gives you a lot of flexibility.

Parameter Store supports deep hierarchies. You will find this becomes annoying to manage. Use hierarchies to group your values by application and environment. Within the heirarchy use a flat structure. I recommend using lower case letters with dashes between words for your paths. For the parameter keys use upper case letters with underscores. This makes it easy to differentiate the two when searching for parameters. 

Parameter store encodes everything as strings. There may be cases where you want to store an integer as an integer or a more complex data structure. You could use a naming convention to differentiate your different types. I found it easiest to encode every thing as json. When pulling values from the store I json decode it. The down side is strings must be wrapped in double quotes. This is offset by the flexibility of being able to encode objects and use numbers.

It is possible to add parameters to the store using 3 different methods. I generally find the AWS web console easiest when adding a small number of entries. Rather than walking you through this, Amazon have good documentation on adding values. Remember to always use "secure string" to encrypt your values.

Adding parameters via boto3 is straight forward. Once again it is well documented by Amazon.

Finally you can maintain parameters in with a little bit of code. In this example I do it with Python.

Using Parameters

I have used Parameter Store from Python and the command line. It is easier to use it from Python.

My example assumes that it a Lambda function running with the policy from earlier. The function is called my-app-dev. This is what my code looks like:

If you want to avoid loading your config each time your Lambda function is called you can store the results in a global variable. This leverages Amazon's feature that doesn't clear global variables between function invocations. The catch is that your function won't pick up parameter changes without a code deployment. Another option is to put in place logic for periodic purging of the cache.

On the command line things are little harder to manage if you have more than 10 parameters. To export a small number of entries as environment variables, you can use this one liner:

Make sure you have jq installed and the AWS cli installed and configured.

Conclusion

Amazon's System Manager Parameter Store provides a secure way of storing and managing secrets for your AWS based apps. Unlike Hashicorp Vault, Amazon manages everything for you. If you don't need the more advanced features of Secrets Manager you don't have to pay for them. For most users Parameter Store will be adequate.

Migrating AWS System Manager Parameter Store Secrets to a new Namespace

When starting with a new tool it is common to jump in start doing things. Over time you learn how to do things better. Amazon's AWS System Manager (SSM) Parameter Store was like that for me. I started off polluting the global namespace with all my secrets. Over time I learned to use paths to create namespaces. This helps a lot when it comes to managing access.

Recently I've been using Parameter Store a lot. During this time I have been reminded that naming things is hard. This lead to me needing to change some paths in SSM Parameter Store. Unfortunately AWS doesn't allow you to rename param store keys, you have to create new ones.

There was no way I was going to manually copy and paste all those secrets. Python (3.6) to the rescue! I wrote a script to copy the values to the new namespace. While I was at it I migrated them to use a new KMS key for encryption.

Grab the code from my gist, make it executable, pip install boto3 if you need to, then run it like so:

copy-ssm-ps-path.py source-tree-name target-tree-name new-kms-uuid

The script assumes all parameters are encrypted. The same key is used for all parameters. boto3 expects AWS credentials need to be in ~/.aws or environment variables.

Once everything is verified, you can use a modified version of the script that calls ssm.delete_parameter() or do it via the console.

I hope this saves someone some time.