Inspired by this post, I want to tell you something about my experience with ARM templates and Terraform.
My learning path for Infrastructure as Code has been backwards. My team had been using Terraform for a while when I joined them, so I worked with Terraform before using ARM templates. This also meant that the existing code was already fairly advanced, so I had to do a deep dive.
It was during Microsoft Ignite 2018 that I started asking myself if Terraform really was the best solution for our environment. So I talked to a few people, did some research and started working with ARM templates. These are my findings.
Update May 10th, 2020
While this post still shows my experience, it is important to take into account that it was written in January 2019 and a lot of the information is outdated. The most important changes since then:
- Documentation and examples for Terraform have improved a lot, and there are many resources available now.
- The what-if functionality for ARM templates is in public preview at this point. This functionality is a lot like Terraform Plan.
I have not worked intensively with Terraform for some time, but when I do again, I might consider an update for this post.
One of the best features of Terraform is without a doubt that it can be used in different environments. AWS, VMware, Azure, Hyper-V, all in the same code. If you are building a solution for multiple customers or a multi-cloud environment, that’s an amazing feature that saves a lot of time. But when running Azure only, this feature doesn’t add that much value.
Plan and Apply is one of the nicest features of Terraform for our Azure-only environment.
Terraform gives you the possibility to see what it will do when you tell it to run.
Infrastructure as Code is a very powerful tool. Like all automation, it can create extremely fast and it can destroy extremely fast. Some kind of control before you point it at your environment is a really good idea. Terraform gives this to you natively. It allows you to run a plan. Terraform will look at the state it thinks the environment should have, based on your templates and the state file. It compares that to the real thing and adds, changes and destroys as needed. It will show you whatever it will do before it runs.
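To make the idea of a plan more concrete, here is a minimal Python sketch of the comparison described above. It is only an illustration of the concept, not how Terraform is actually implemented, and the resource names and settings are made up for the example.

```python
def plan(desired, actual):
    """Compare the desired resources with what exists and list the actions needed."""
    # Resources in the configuration that do not exist yet.
    to_add = sorted(name for name in desired if name not in actual)
    # Resources that exist but are no longer in the configuration.
    to_destroy = sorted(name for name in actual if name not in desired)
    # Resources that exist on both sides but with different settings.
    to_change = sorted(
        name for name in desired
        if name in actual and desired[name] != actual[name]
    )
    return {"add": to_add, "change": to_change, "destroy": to_destroy}


# Hypothetical configuration (what the templates describe).
desired = {
    "rg-demo": {"location": "westeurope"},
    "storage-demo": {"sku": "Standard_LRS"},
    "vnet-demo": {"address_space": "10.0.0.0/16"},
}

# Hypothetical real environment (what is deployed right now).
actual = {
    "rg-demo": {"location": "westeurope"},
    "storage-demo": {"sku": "Standard_GRS"},
    "vm-old": {"size": "Standard_B1s"},
}

result = plan(desired, actual)
for action, names in result.items():
    for name in names:
        print(f"{action}: {name}")
```

The point is that nothing is changed while the plan is computed. You get a list of intended actions to review first, and only then decide whether to apply them.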
I have created PowerShell scripts that try to recreate this step with ARM templates, as I love the functionality. But it’s not native and not nearly as neat as Terraform. On top of that, with Terraform you can’t go around this step even if you wanted to. This has saved me a few times when I was about to deploy a typo.
When you create a Terraform deployment, a state file will be created. This has some benefits and some drawbacks.
The state file is what gives us Terraform Plan. It helps us know which resources will be deployed and how they will interact with each other. It’s quite flexible as well. For example, you can use the taint command. This marks a resource so that Terraform treats it as needing to be recreated, even though it is present in Azure. We used this a lot to get staging environments in the same state as production. We tainted the resource and Terraform would delete the resource and redeploy it.
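As a rough illustration of what tainting means for the next run, here is a small Python sketch. It is a simplified model of the behaviour described above and not Terraform’s own logic. The function name and the resource names are hypothetical.

```python
def actions_for(resource_names, existing, tainted):
    """Return the ordered actions for each resource in the configuration."""
    actions = []
    for name in resource_names:
        if name in existing and name in tainted:
            # A tainted resource is removed and deployed again.
            actions.append(("destroy", name))
            actions.append(("create", name))
        elif name not in existing:
            # Not deployed yet, so it only needs to be created.
            actions.append(("create", name))
        else:
            # Deployed and not tainted, so it is left alone.
            actions.append(("no-op", name))
    return actions


# Hypothetical staging environment with one tainted virtual machine.
configured = ["staging-vnet", "staging-vm", "staging-disk"]
existing = {"staging-vnet", "staging-vm"}
tainted = {"staging-vm"}

staging_actions = actions_for(configured, existing, tainted)
for action, name in staging_actions:
    print(f"{action}: {name}")
```

In this model the tainted machine is destroyed and created again even though nothing about its configuration changed, which is what made taint useful for resetting staging.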
On the downside, the state file needs to be available to all people who deploy. It isn’t that hard to configure an Azure Storage Account and use remote state for that, but the problem is that these files often contain sensitive information. Security can become an issue and needs to be considered.
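One way to get a feeling for how much sensitive information ends up in such a file is to scan it for suspicious key names. The Python sketch below does this for a small, made-up JSON document. The structure, the resource types and the list of hints are all assumptions for the example and do not reflect the real state file format.

```python
import json

# Assumed hints: key names containing these words are treated as sensitive.
SENSITIVE_HINTS = ("password", "secret", "key", "connection_string")


def find_sensitive_paths(node, path=""):
    """Recursively collect paths whose key name hints at sensitive data."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if any(hint in key.lower() for hint in SENSITIVE_HINTS):
                found.append(child)
            found.extend(find_sensitive_paths(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_sensitive_paths(item, f"{path}[{index}]"))
    return found


# Hypothetical, simplified state-like document with placeholder values.
state_text = """
{
  "resources": [
    {
      "type": "example_storage_account",
      "attributes": {
        "name": "stagingstore",
        "primary_access_key": "not-a-real-key",
        "location": "westeurope"
      }
    },
    {
      "type": "example_virtual_machine",
      "attributes": {
        "name": "staging-vm",
        "admin_password": "not-a-real-password"
      }
    }
  ]
}
"""

state = json.loads(state_text)
sensitive_paths = find_sensitive_paths(state)
for found_path in sensitive_paths:
    print(found_path)
```

A check like this only finds values with obvious names. It is no replacement for restricting who can read the storage account that holds the state.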
Terraform gets a lot of praise for being built on HashiCorp’s HCL, a language that tries to find a middle ground between JSON and readability.
And it’s nice. It’s short, simple, easy to read. I did find it took a bit of getting used to. I know JSON is supposed to be for machines, but I don’t mind the structure as long as you have some help from tools like Visual Studio Code. When you stick with easy resources, HCL is without a doubt easier. But I speak as someone who had to learn from scratch without understanding the resources underneath, as you do when you learn to work with ARM templates. And then it gets real hard real fast and there’s a lot going on.
To give an example, here’s the code for a very basic storage account through Terraform and ARM templates. Terraform even throws in the creation of the resource group (which ARM templates weren’t able to do at all for a long time).
Terraform depends on new releases and an available API before a module or provider can support a new feature. The community is big and Microsoft has stated that they plan to work together with HashiCorp. But let me give you an example.
We were using Terraform to roll out some VMs behind an Application Gateway. When the Web Application Firewall (WAF) was released, we wanted to use it. The Terraform code for the setup was up and ready. What we couldn’t do was create the rule exceptions within the WAF. That just wasn’t available yet within Terraform. There are some workarounds for this:
- Create this part with an ARM template within Terraform. This can’t be done for just a small part of a resource, as in this example, though. And when you run an ARM template within Terraform, you lose all the advantages of Terraform, like the possibility to plan, apply and destroy resources.
- Create this part with a different kind of automation, like PowerShell or Python.
- Do this part manually. 🙁
All of these options might work, but they change the standards we are setting and I see them as workarounds, not solutions.
Now to me, this is a huge one: how hard is it to get started, what resources are available, and how accessible is the code?
I got into a Terraform environment with an incredibly complicated structure. There were modules, resources, variables, data sources, remote states. I found it really hard to get the hang of how it is structured.
I learn best by example. See what others have done, figure out what every element does and why choices are made. With Terraform, I found it hard to find those examples online. All troubleshooting or Google searches led back to the HashiCorp site, which is alright, but it’s limited to very basic information.
When I decided to give ARM templates a chance, I was overjoyed with the amount of information available. There are so many blogs, docs, repositories with examples. And if all else fails, you can put together the resource in the Azure portal and often it will give you a button with an automation option, allowing you to see the ARM template for that resource. Those templates are not perfect for automation, but they are great for understanding the structure of the configuration.
To be fair, my experience with Terraform helped a lot with learning to use ARM templates. Most times it’s the other way around. But being able to use my own learning style really did help.
It’s all subjective, of course, and this is just my opinion.
I want to love Terraform, I really do. I want to love it because it’s a powerful way to work, very flexible, and maybe has even more possibilities than ARM templates.
But the truth is I’ve been working with it for more than a year and I still don’t feel like I fully understand it. Once there are more resources involved, it gets harder to manage as there is so much going on.
I started working with ARM templates a short while ago and I already feel comfortable enough with them to teach my team.
What I really miss with ARM templates is the plan step, and I feel the deployment should give a lot more feedback about what it is going to do.
I think the process with Terraform would have been a lot easier if I had done it the other way around and started with ARM templates. Maybe I will give it another go in a few months. But for now, ARM templates win the race for me.