I am a huge advocate for infrastructure automation on my team. I love automation. I don’t want to spend too much time convincing you why you should focus more time on automation (if you’re here you’re probably convinced already), but here is a little shortlist of the reasons I think automation is a critical part of any software product.
I hate to break it to you, but you make mistakes. We all do. I constantly mistype things, misread things, forget things, put things in the wrong order. It’s natural. So take yourself out of the equation, and let the computer do it. It’s better at it.
It all takes time: logging into consoles, looking up config, connecting to servers, writing commands. Especially when, as per the previous point, you screw something up and have to fix it. Again, just let the computer do it. Eliminate toil. Get a coffee and use the extra time to write some more automation. Live your life.
You don’t need to maintain setup instructions and checklists that are inevitably neglected and become a liability. Infrastructure as Code / Programmable Infrastructure: the automation you write can live in version control. It can go through peer review processes. You always know what the steps are.
Also, under most automation frameworks, configuration and process live side-by-side. So just as you always know the steps required to put your stacks together, you always know exactly how they are configured, provided you always build using an automated process.
It’s the early days of a project, everyone is keen to get started. Someone jumps onto the AWS Console and hacks together the infrastructure you need, and everyone rejoices. Until you need to produce your production stack a month later, and no-one quite remembers what went into it. When you attempt it again, you always miss that little something. Then it inevitably catches on fire.
Reproducible means dev/prod parity is a piece of cake, it means immutable infrastructure is easier to achieve, and it avoids snowflake components appearing in your stacks.
“Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy. Avoid writing scripts or custom code to deploy and update your applications: automate in a language that approaches plain English.”
When it comes to automation, as with anything in software, there are many options out there, each with their own strengths and weaknesses; make sure you have a look around and find what works for you and your team. Ansible is powerful, yet flexible and unopinionated, which is why I use it. So here is another shortlist: my highlights when it comes to Ansible. Included within is a rough description of how Ansible ticks, but it is intended to illustrate why I find it effective, not to act as its own guide to getting started with Ansible; check out the really good Ansible docs for that!
Found a bug? You can fix it. Need an obscure feature? You can write it. That is, if one of the other 3,200 contributors doesn’t do it first. I’m not going to waste time selling you on the benefits of open source; read on.
Flexible configuration management
Ansible configuration management starts with the idea of an inventory.
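A minimal inventory along the lines described below might look like this (the api and webapp hostnames are illustrative, following the same group.environment.project naming pattern as db.dev.ansibled):

```ini
[db]
db.dev.ansibled

[api]
api.dev.ansibled

[webapp]
webapp.dev.ansibled

[environment.dev]
db.dev.ansibled
api.dev.ansibled
webapp.dev.ansibled

[project.ansibled:children]
environment.dev
```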
Our example inventory file, formatted as a .ini file
Each section defined in an inventory is a group, inside which is defined a list of hosts. Alternatively, a group may be defined by a list of sub-groups (which is the :children syntax in the definition of the project.ansibled group in the example). So in our example here, we have a total of three hosts defined:
· db.dev.ansibled
· api.dev.ansibled
· webapp.dev.ansibled
We also have five groups defined that the hosts inhabit:
· db, api, webapp (which each contain their respective host)
· environment.dev (which contains all three hosts)
· project.ansibled (which also contains all three hosts as it is defined by the environment.dev group as its only sub-group)
Each host specifies a potential target for the Ansible commands that we run, while groups control how variables are inherited as well as providing labels to be used when defining which hosts to run Ansible against. Each group can define variables that it wants its hosts to inherit in a .yml file in the group_vars directory (along with “global” variables defined in an all.yml file). These variable files leverage the Jinja2 templating engine, allowing the developer to reference other defined variables or perform any transformations possible in Jinja2. Here are some possible example group_vars definitions for a couple of the groups of our example inventory file from before:
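A sketch of what those variable files could contain (the project_name and env_name variable names are my own illustrative choices; only db_username appears in the text):

```yaml
# group_vars/project.ansibled.yml
project_name: ansibled

# group_vars/environment.dev.yml
env_name: dev

# group_vars/db.yml
# Jinja2 templating lets us build values out of inherited variables
db_username: "{{ project_name }}-{{ env_name }}-user"
```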
Configuration management with Ansible group variables
So, thanks to its groups, the db.dev.ansibled host will have db_username: ansibled-dev-user defined, leveraging variables that have been provided by the project.ansibled and environment.dev groups. This separation allows us to extend our inventory and define new configurations easily without repeating ourselves in our configuration.
We can simply define a new environment.staging environment, or even a new project.moreansible project; we just need to ensure the new db hosts are listed in the correct groups to have our db group’s variables resolve as intended for those hosts, without touching our original definitions for db, like so:
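One possible arrangement of those additions (the exact group layout here is a sketch; for clarity, project groups list their db hosts directly, and api/webapp hosts are omitted):

```ini
[db]
db.dev.ansibled
db.staging.ansibled
db.dev.moreansible

[environment.dev]
db.dev.ansibled
db.dev.moreansible

[environment.staging]
db.staging.ansibled

[project.ansibled]
db.dev.ansibled
db.staging.ansibled

[project.moreansible]
db.dev.moreansible
```

```yaml
# group_vars/environment.staging.yml
env_name: staging

# group_vars/project.moreansible.yml
project_name: moreansible
```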
Extending our inventory with DB hosts for a staging environment and a new project
With our additions, db.staging.ansibled will resolve db_username: ansibled-staging-user, and db.dev.moreansible will resolve db_username: moreansible-dev-user, purely by the way variable inheritance works.
If need be, hosts can also declare their own specific variable definitions in their own .yml file in a host_vars directory that operates in the same way as the group variables. This is useful if a host requires a variable declaration that is an exception to the rules defined by your group definitions.
The result of this system, especially given it is backed by the mature Jinja2 engine (seriously you can do anything with Jinja2), is a powerful and very flexible configuration management system that feels very natural to software developers.
Playbooks and the module library: reduce, reuse, recycle
Ok, so given some defined groups and hosts, how do we define what Ansible is actually going to do? Ansible “code” is defined by YAML plays: a series of tasks that are run against a specified group or host. Plays in turn are defined in a playbook, which serves as the entrypoint for running Ansible. Here’s an example playbook, containing two plays:
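A sketch of such a playbook, matching the two plays described below (bucket names, AMI IDs, zone names, and registered variable names are all placeholders):

```yaml
# playbook.yml
- hosts: db
  tasks:
    - name: Generate a connection details file from a template
      template:
        src: connection-details.j2
        dest: /tmp/connection-details.txt

    - name: Create an S3 bucket if it does not already exist
      s3_bucket:
        name: "{{ config_bucket }}"
        state: present

    - name: Copy the connection details file to the bucket
      s3:
        bucket: "{{ config_bucket }}"
        object: connection-details.txt
        src: /tmp/connection-details.txt
        mode: put

    - name: Delete the generated file
      file:
        path: /tmp/connection-details.txt
        state: absent

- hosts: api
  tasks:
    - name: Provision an EC2 instance
      ec2:
        key_name: "{{ key_name }}"
        instance_type: t2.micro
        image: "{{ ami_id }}"
        wait: yes
      register: ec2_result

    - name: Update a Route53 DNS record for the instance
      route53:
        command: create
        zone: "{{ dns_zone }}"
        record: "{{ inventory_hostname }}"
        type: A
        value: "{{ ec2_result.instances[0].public_ip }}"
        overwrite: yes
```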
An example Ansible playbook, doing… well, doing some really example things
In this example (and it’s definitely just an example; I don’t know why you would do these exact steps), we would run two plays. First, for each db host, we perform four tasks: generate a connection details file from a template, create an S3 bucket (if it doesn’t already exist), copy our connection details file to the S3 bucket, and then delete our generated file. Second, for each api host, we perform two tasks: provision an EC2 instance, and then update a Route53 DNS record for that instance.
Whatever you’re automating, someone has likely tackled it before and written an Ansible module for it. The example above already showcases six of them:
· template : generate a file from a Jinja2 template
· s3_bucket : create and manage AWS S3 buckets
· s3 : manage files inside AWS S3 buckets
· file : manage files and directories on local filesystems
· ec2 : provision and manage AWS EC2 instances
· route53 : manage DNS records inside AWS Route53 hosted zones
Ansible has hundreds of modules for tasks ranging from OS tasks such as restarting system services and installing packages, to provisioning AWS cloud components such as load balancers, DNS zones, and database instances.
An important note on hosts
Ansible by default expects that the hosts defined in the inventory are hostnames to be connected to via SSH so that commands can be run on them. Generally speaking, in my work, this isn’t what I want: my hosts are “logical hosts”, and I want the commands intended for those hosts to execute locally, on the machine where Ansible is running. We can change this default behaviour with a simple declaration in our group_vars/all.yml file:
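The declaration in question is Ansible’s built-in ansible_connection variable:

```yaml
# group_vars/all.yml
# Run every host's tasks on the local machine instead of over SSH
ansible_connection: local
```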
Freedom: imperative and extensible
And when Ansible’s module library fails to cater to your use case, you aren’t stuck. Ansible has awesome support for loading custom modules and filters without any hassle, if you’re handy with Python. Or, if you’re used to administering stacks and environments using Bash scripts and the like, you can easily drop down to system commands, using Ansible to execute scripts or utilise CLI tools. The command, shell, and raw modules are all flavours of the ability to run system commands, just with varying levels of Ansible wrapped around them. Most often I find the command module the most useful, as it retains the most power to hook into Ansible’s variable and templating engines.
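For instance, a task can template variables straight into a system command (the script name and variable here are hypothetical):

```yaml
- name: Invoke an existing deploy script with templated arguments
  command: ./deploy-stack.sh --env {{ env_name }}
  register: deploy_result

- name: Show what the script printed
  debug:
    msg: "{{ deploy_result.stdout }}"
```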
An important characteristic of Ansible is that playbooks are imperative, rather than declarative. Some automation and infrastructure management tools work by declaring a desired state of configuration. With Ansible, by contrast, you define a series of steps that are evaluated and executed for each host. As a software developer, I find the imperative approach a more expressive, flexible, and natural way to solve problems, as it has more parallels to other high-level programming, and avoids the constraints and structure of declaratively-defined configuration. Ansible playbooks provide support for a variety of handy imperative programming constructs: conditional execution of steps, looping over collections and structures, dynamically including lists of tasks or other playbooks, and more.
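Those constructs look like this in a playbook (package names and the included file are illustrative):

```yaml
- name: Install packages, but only on Debian-family hosts
  apt:
    name: "{{ item }}"
    state: present
  with_items:
    - nginx
    - postgresql-client
  when: ansible_os_family == "Debian"

- name: Pull in a reusable list of tasks
  include: common-tasks.yml
```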