Cloud devs will eat your lunch
A partner at our shop recently talked to a mid-sized software company about their product. The product was a successful <redacted> solution deployed on-premises at <redacted> around the country. The discussion started with talk about infrastructure work, but it evolved quickly once our associate learned some of the details of their process: Manual builds. Manual testing. No guardrails. A cycle time of 6 months. And a biannual trip to each client site to install new software.
Maybe this is familiar to you. Having been a part of the software industry for 17 years now, I’ve worked with many companies big and small who were in the same position. Their software was “too complicated” to add build automation. They worked “too fast” to create test infrastructure. And their clients would “never accept” a cloud deployment.
It’s bizarre that, in an industry all about technology, there are still so many who think they are impervious to change. Their first product most likely put some old mainframe software out of commission, and that mainframe software likely replaced a bunch of paper handlers, yet they don’t think they could be made similarly obsolete by a transition to the cloud. So here I am, writing an article to convince you to either become part of the future or be left in the past. Here’s why:
Cloud Software Needs to Be Nimble
Indeed, in days past, a 6-month development cycle was the essence of nimble. After all, when every delivery involved a complicated migration and installation, no one could afford to be delivering every week. The cloud revolution has reversed the incentives, however. Not only can we deploy new software to the cloud every Friday, but we have to in order to create a successful product. Cycle time is king, and we are its subjects.
The reason for this is simple: we aren’t building a monolith. Every new feature means new services, containers and procedures. If we deploy them all as a stack, we risk watching it all fall apart like so many Jenga blocks. When we release incrementally, we can see the problems and resolve them before our customers have time to fill out a bug report. And all while our competitors are sending out an army of developers with flash drives to fix their production issues.
Cloud Architecture Saves Time
So, if you aren’t already making cloud software, you might be wondering: how can any team possibly keep up one-week cycle times? That’s not enough time to design, develop, QA and deploy a project, is it? The answer is that the entire team has to be working at full efficiency.
When I was an undergraduate Computer Science student, Frederick Brooks’ The Mythical Man-Month was required reading. The book talks about the pitfalls of large software projects and the inevitable failure of many of our most ambitious ones. Among the strongest cases Brooks makes is that the more developers on a project, the more communication pathways there are to support, and the slower work gets done. After all, if a line of code written by any engineer on the project could potentially break things for everyone else, you have to move slowly to keep everyone in sync.
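To put rough numbers on that point: if every pair of people on a project is a potential communication pathway, n people have n(n-1)/2 of them. The Python sketch below is a back-of-the-envelope model only. The function names are made up for illustration, and the second function rests on my own simplifying assumption that people talk freely inside a team while each pair of teams shares a single channel (a specification, say).

```python
def communication_paths(people: int) -> int:
    """Distinct pairs among `people` participants: n(n-1)/2."""
    if people < 0:
        raise ValueError("people must be non-negative")
    return people * (people - 1) // 2


def paths_when_split(total_developers: int, team_size: int) -> int:
    """Pathways when developers are split into equal teams.

    Assumption (illustrative, not from the book): every pair inside a team
    communicates, and each pair of teams shares exactly one channel.
    """
    if team_size <= 0 or total_developers % team_size != 0:
        raise ValueError("total_developers must divide evenly into teams")
    teams = total_developers // team_size
    inside_teams = teams * communication_paths(team_size)
    between_teams = communication_paths(teams)
    return inside_teams + between_teams


if __name__ == "__main__":
    print(communication_paths(100))   # one group of 100 developers
    print(paths_when_split(100, 10))  # ten teams of ten
```

Under those assumptions, 100 developers in a single group have 4,950 pathways to keep healthy, while ten teams of ten have 495. The real world is messier, but the order-of-magnitude gap is the point.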
There are plenty of solutions to this problem that I’ve seen deployed in practice, and it is by no means impossible. But cloud software architecture by its very nature forces us to solve this problem. Rather than focusing on large, complicated features, we have to focus on small, well-defined reliable parts. When new features need to be done in parallel, we don’t add developers to the project, we add teams to the project, giving them specifications and data to work against and allowing them to do what they do best: build things.
Always Be Building
The final piece of the cloud puzzle is tooling. Sure, the teams are small and well contained, but how are they more efficient? The answer is the same for teams now as it was 17 years ago, when I started maintaining source repositories at a 100-developer software shop: automation. At that first job, I met an engineer who said something that has stuck with me ever since:
When I need to do something more than once, I automate. When I have to do something once, if I think I’ll ever have to do it again, I automate. And I’m pretty pessimistic about having to do things only once.
The message I took was that you have to be proactive about automation, because anything you don’t automate will end up being your entire job. Your developers aren’t writing code all day, they’re doing the processes that allow them to start coding all day. If you aren’t looking for every way to keep your developers looking at code, you’re paying them to press buttons all day.
And again, cloud software development needs automation. You can’t build and deploy these stacks of disparate services and systems without a fully enabled CI/CD system hooked into your source control, running a test suite and a test deployment every time your developers merge, so that your system is always no more than one rollback from working optimally. It is difficult, and it’s expensive, but once you’ve done it all, your developers don’t step on each other’s toes, your QA department doesn’t spend 3 weeks testing a release only to send it back with trivial bugs, and your operations personnel don’t download and build distributions to send to customers.
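As a rough illustration of "no more than one rollback from working," here is a minimal Python sketch of what a merge hook might do. It is not any real CI/CD product’s API: the Pipeline class, on_merge, and the run_tests and deploy callables are all hypothetical stand-ins for whatever your own tooling provides.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Pipeline:
    """Hypothetical merge-triggered pipeline; all names are illustrative."""

    run_tests: Callable[[str], bool]  # True if the test suite passes
    deploy: Callable[[str], bool]     # True if the deployment comes up healthy
    live: Optional[str] = None        # last revision known to be working

    def on_merge(self, revision: str) -> Optional[str]:
        """Handle one merge and return the revision that is live afterwards."""
        if not self.run_tests(revision):
            # Failing tests never reach deployment; the live revision is untouched.
            return self.live
        if self.deploy(revision):
            self.live = revision
            return self.live
        # Deployment failed: roll back by redeploying the last good revision.
        if self.live is not None:
            self.deploy(self.live)
        return self.live


if __name__ == "__main__":
    # Fake checks for demonstration: a revision fails if its name says so.
    pipeline = Pipeline(
        run_tests=lambda rev: "failing-tests" not in rev,
        deploy=lambda rev: "broken-deploy" not in rev,
    )
    for rev in ["r1", "r2-failing-tests", "r3-broken-deploy", "r4"]:
        print(rev, "->", pipeline.on_merge(rev))
```

The design choice worth noticing is that the pipeline only ever remembers one thing, the last revision that worked, and every failure path ends there.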
You just build.
Cloud Software is the Future
So you believe me now (hopefully!), but you’re still not sold. “We can’t deploy to the cloud, because of HIPAA, FERPA, or <insert acronym> security requirements! We’re a special case!” We get this one a lot. And 5 years ago, you were right. But now, you’re not.
Google Cloud Platform has software support for HIPAA. Amazon does too. At Autodeus Engineering, we’ve done work on both. Azure has support for Top Secret data. Amazon (of course) does as well. If the nation’s most important secrets are safe in the cloud, what makes you think your clients’ aren’t?