Howard Dierking

The Role of QA in Cloud DevOps

[Warning: Potentially Inflammatory Topic]

Today was a bit of a rough day for me in that I did something that is generally a bit aggressive for me. I abruptly closed an issue that was opened by one of our QA folks, thereby halting the discussion. The issue wasn’t really a bug - even though it was opened as one. It was more of a questioning of the API’s resource naming strategy (more specifically, a question around the use of singular and plural nouns for a specific path segment).

To be really honest with you, I’m not sure whether I did the right thing weighing in as forcefully as I did (my PM told me in no uncertain terms that I made a mistake). When I look back on today in a few months, I may conclude that the way I handled this issue was more akin to that of an irritated dev or dev lead than that of a dev manager, and certainly not that of a director. If so, this is just one of those hard lessons that I get to learn.

It has been bugging me though, and that has given me the opportunity to reflect on why I closed the issue, what some of my underlying beliefs and assumptions were, and what I think should be done differently with regard to QA in general - especially in the new world of DevOps, networks of services, and the cloud.

First, a little context for why I closed the issue. It was the culmination of a few things:

Now, it’s possible that I’m forming opinions based on missing context and bad assumptions. If so, I’m more than happy to own that. However, for QA folks who believe that the role of QA is to write “traditional” functional test automation, let me at least give you the perspective of a dev manager trying desperately to build a mature, large-scale cloud system. There are two main problems with the traditional approach.

  1. The bar for writing functional tests has become insanely low - low enough that those tests add just a small amount of additional value on top of the unit tests that the dev team owns. In fact, in some cases, the technology bar is low enough that the dev team could choose to go ahead and write and maintain the functional tests as well.
  2. The stuff that really matters for my team is all of the stuff around the code. For example, at what point do I saturate an instance? Do I have autoscaling configured correctly? Can I really do a zero-downtime deployment? Did this new algorithm that hurt maintenance actually improve latency? Should we enable a CDN?

Let me put it another way. I want QA as a partner in driving value to the service - not as a housekeeper dusting and arranging the furniture to make the room slightly more aesthetically pleasing.

The role of QA needs to change from writing tests that verify functional (and perhaps nonfunctional) requirements to setting up the baselines - the scaffolding of measures and alarms - that will give the dev team freedom to experiment on all aspects of the system (both code and infrastructure) with confidence.
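
To make "baselines" a little more concrete, here is a minimal sketch of the kind of check I have in mind, written in Python. It is not something my team actually runs; the function names, the choice of the 95th percentile, and the 10% tolerance are all assumptions made up for illustration. The idea is simply to compare latency samples from a candidate build against samples from a known-good baseline and raise a flag when the candidate is meaningfully slower.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_against_baseline(baseline_ms, candidate_ms, pct=95, tolerance=0.10):
    """Compare a candidate's latency percentile to the baseline's.

    Hypothetical helper: returns (ok, baseline_value, candidate_value).
    ok is False when the candidate's percentile is more than `tolerance`
    (expressed as a fraction) above the baseline's percentile.
    """
    base = percentile(baseline_ms, pct)
    cand = percentile(candidate_ms, pct)
    return cand <= base * (1 + tolerance), base, cand
```

A check like this could run after every deployment, with `ok == False` wired to whatever alarm the team already uses. The point is less the arithmetic than who owns it: if QA maintains the baseline and the threshold, the dev team can try that new algorithm and find out quickly whether it really improved latency.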

I believe that this is the best way for QA to drive significant value in a world where the dev team needs to continue iterating faster and faster.

But again, this is just my perspective based on the context I have. What say you? Are you in a dev organization that has QA support - and what does that look like? Are you in QA and think that I’m completely out in left field? That’s fine too. Let me know!