This week I attended #pipelineconf, a new one-day continuous delivery conference in London.
I did a talk with Alex on how we do continuous delivery at Unruly, which seemed to be well received. The slides are online here.
There were great discussions in the open space sessions, and ad-hoc in the hallways. Here are a few points that came up in discussion that I thought were particularly interesting.
De-segregate different categories of tests
There are frameworks for writing automated acceptance tests, tools for automated security testing, frameworks and tools for performance testing, and tools for monitoring production systems.
We don’t really want to test these things in isolation. If we’re implementing a feature requested by a customer, it probably has non-functional requirements that apply to it as well, and we really want to test those alongside the feature acceptance test.
For example, for each acceptance test we could define speed and capacity requirements. We could also route the HTTP traffic from each test through tools such as ZAP, which try common attack vectors.
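As a rough illustration, here’s a minimal sketch of what folding a speed requirement into an ordinary acceptance test might look like. It assumes JUnit 4 and a hypothetical checkout endpoint and response-time budget; none of these names come from a real codebase.

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.assertTrue;

    import java.net.HttpURLConnection;
    import java.net.URL;

    import org.junit.Test;

    public class CheckoutAcceptanceTest {

        // Hypothetical endpoint and budget; substitute the service under test.
        private static final String CHECKOUT_URL = "http://localhost:8080/checkout";
        private static final long RESPONSE_TIME_BUDGET_MS = 500;

        @Test
        public void checkoutRespondsSuccessfullyWithinBudget() throws Exception {
            long start = System.currentTimeMillis();

            HttpURLConnection connection =
                    (HttpURLConnection) new URL(CHECKOUT_URL).openConnection();
            int status = connection.getResponseCode();

            long elapsed = System.currentTimeMillis() - start;

            // Functional requirement: the feature works at all.
            assertEquals(200, status);

            // Non-functional requirement: it works fast enough.
            assertTrue("Took " + elapsed + "ms, budget is " + RESPONSE_TIME_BUDGET_MS + "ms",
                    elapsed <= RESPONSE_TIME_BUDGET_MS);
        }
    }

Because the test drives plain HTTP, pointing the JVM at a ZAP instance acting as a proxy (via the standard http.proxyHost and http.proxyPort system properties) would send the same traffic through the security scanner without changing the test.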
Testing and monitoring shouldn’t be so distinct
We often think separately about what things we need to monitor in our production environment and what things we want to test as part of our delivery pipeline before releasing to production.
This often leads us to greatly under-monitor things in production. It makes us over-reliant on the checks in our pipeline to prevent broken things from reaching production, and it means we often fail to spot behaviour degradation in production at all.
Monitoring tools for production systems often focus on servers/nodes first, and services second.
We’d really like to run our acceptance tests for both functional and non-functional requirements against our production systems in the same way we run them as part of our deployment.
This isn’t even particularly hard. An automated test from your application test suite can succeed, fail, or error in some unexpected way, and these map directly onto the OK, CRITICAL, and UNKNOWN states that tools like Nagios expect. You can simply get Nagios to execute your tests.
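As a sketch of that idea, the following wrapper (assuming JUnit 4 and the hypothetical test class from above) runs a test suite and reports the outcome using the standard Nagios plugin exit codes.

    import org.junit.runner.JUnitCore;
    import org.junit.runner.Result;

    // Minimal Nagios-style check: run an existing JUnit test class and
    // report its outcome with the exit codes Nagios plugins use:
    // 0 = OK, 2 = CRITICAL, 3 = UNKNOWN.
    public class AcceptanceTestCheck {

        public static void main(String[] args) {
            try {
                // CheckoutAcceptanceTest is the hypothetical suite from earlier.
                Result result = JUnitCore.runClasses(CheckoutAcceptanceTest.class);

                if (result.wasSuccessful()) {
                    System.out.println("OK - " + result.getRunCount() + " checks passed");
                    System.exit(0);
                } else {
                    System.out.println("CRITICAL - " + result.getFailureCount() + " checks failed");
                    System.exit(2);
                }
            } catch (Exception e) {
                // Anything unexpected: misconfiguration, classpath problems, etc.
                System.out.println("UNKNOWN - " + e.getMessage());
                System.exit(3);
            }
        }
    }

Nagios can then run this as a check command on a schedule and alert on the exit status, just as it would for any other plugin.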
Monitoring your application behaviour in production also gives you the opportunity to remove tests from your deployment pipeline, if it’s acceptable to the business for a feature to be broken or degraded in production for a certain amount of time. This can be a useful trade-off for tests that are inherently slow and not critically important.
Non-functional requirements aren’t
People often call requirements about resilience/robustness/security/performance “non-functional requirements” because they’re requirements that are not for features per se. However, they are still things our customers will want, and they are still things that our stakeholders can prioritise against features – as long as we have done a good enough job of explaining the cost and risk of doing or not doing the work.
Technical people typically help come up with these requirements, but the requirements should be prioritised along with our features.
There’s no point building the fastest, most secure system if no-one ever …