Stefana Muller recently had a good post titled “5 challenges to continuous deployment,” in which she discusses the use of scripts for deployment. Most of the folks I’ve talked with recently are using deployment scripts, and many do have issues with them, so I thought it’d be worth sharing some of their experiences.
No one updates the scripts until there’s a failure.
In most companies the scripts are not treated as part of the product. They’re merely a tool. Dev worries about features and architectural issues and leaves the tools to the DevOps team. No, that’s not how DevOps is supposed to work, but that’s what we find again and again in operations both large and small. So it shouldn’t surprise anyone that Dev seldom worries about how their changes affect the tool, and that issues with the tool are inevitably found only after the fact, during deploy.
Scripts begin to conflict with each other.
Actually, this was our first introduction to the problem. While working with a beta customer for our VCTR system, we were on a web call with him, watching as he updated an install script. After a five-minute change he tried to run it, only to find that another script failed and required updating because they’d made a small change in architecture. That led to another, and so on, until two hours had passed.
Deployment scripts aren’t architected for reliability.
After all, they’re just a tool. If you’re old enough to remember run books (the thing, not the company), you’ll remember that at least half of a run book consisted of checks of previous steps. Unfortunately, not one of the home-brewed deployment systems I’ve seen handles such quality checks. Rather, they’re fire-and-forget. Worse, if they don’t work the first time, the response is almost always “run it again.”
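To make the run-book idea concrete, here is a minimal sketch in Python of a deployment runner that checks each step before moving on, rather than firing and forgetting. It is only an illustration under stated assumptions: the names `Step`, `StepFailed`, and `run_steps` are hypothetical and do not come from any particular tool, and a real system would also need rollback, timeouts, and logging.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One run-book entry (hypothetical structure, for illustration only)."""
    name: str
    action: Callable[[], None]  # performs the change
    verify: Callable[[], bool]  # checks that the change actually took effect


class StepFailed(Exception):
    """Raised when a step's verification check does not pass."""


def run_steps(steps: List[Step]) -> List[str]:
    """Run steps in order, verifying each one before starting the next.

    Returns the names of the completed steps. Stops at the first step
    whose check fails instead of carrying on blindly.
    """
    completed: List[str] = []
    for step in steps:
        step.action()
        if not step.verify():
            raise StepFailed(
                f"step '{step.name}' failed verification; completed so far: {completed}"
            )
        completed.append(step.name)
    return completed
```

The point of the design is that a failure stops the run at a known step with a record of what already succeeded, which is more useful than simply running the whole script again.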
To paraphrase the CTO of a large financial firm: “bolting scripts together isn’t automation.”
When they start out using scripts for deployment, most folks don’t realize they’re going to build an entire system, so they don’t spend time architecting a full solution. Consequently, they end up building and rebuilding as the app grows, as new deployment theaters are added, as architectures change, or as failures occur. Each cycle wastes manpower, causes delays and frustration, and still doesn’t result in a complete solution, because a complete solution was never part of the plan.
If you manage to build a deployment system, you’ll have missed some use cases.
I was at a recent meetup where a very bright engineer was describing the deployment system he’d built for a recent project. Having been through the wringer described above, this time he had built a full deployment system. When someone in the audience asked about dependencies between containers and services, he answered, “that’s information you have in your head, so you have to keep that in mind when promoting a build to production and promote them in sets.” So even after he had spent the time to build a deployment system, an important condition for a deployment to work was still left to the user.
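Dependency information like this does not have to live in someone’s head. As a rough sketch, assuming the dependencies are written down as a simple mapping, Python’s standard `graphlib` module (Python 3.9+) can compute an order in which every service is promoted only after the things it depends on. The function name `promotion_order` and the service names in the usage note are hypothetical, chosen purely for illustration.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def promotion_order(depends_on: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which each service comes after its dependencies.

    `depends_on` maps a service name to the set of services it needs.
    A circular dependency raises graphlib.CycleError.
    """
    return list(TopologicalSorter(depends_on).static_order())
```

For example, given `{"web": {"api"}, "api": {"db", "cache"}, "db": set(), "cache": set()}`, the returned order places `db` and `cache` before `api`, and `api` before `web`. This covers ordering only; it says nothing about version compatibility or what is currently running.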
If you get this far, the deployment system is now your responsibility.
A dev VP at a large SaaS company was describing the deployment system he’d built for their primary service before moving into his current role. It had been working well for several months, but he couldn’t maintain it anymore. To compound the situation, the company now wanted the same capability for all their other services. He was left looking either for a way to dedicate a body or two to expanding and maintaining the system or for a replacement.
Plus, you’ll require a body just for scripts.
He’s not the only one to have to dedicate a body to deployment. When we first started building out our toolchain, pipeline scripts were the natural starting point. After all, that’s what most of the literature suggests, and there are few commonly known tools for the job. We ourselves ended up with an engineer spending most of his time updating scripts.
Deploying a complex, multi-container application doesn’t require a tool; it requires a system: one that takes into account all the dependencies between individual containers, third-party services, architectural changes, different target environments, and what’s currently running, and that of course checks for and deals with failures along the way. That’s why we built Skopos, and now you can use it too.