Testing, in production

Testing software, somewhat surprisingly, is a contentious subject. The argument is not over whether testing should be done, but over the right way to go about it. We test to give ourselves confidence: confidence that the software does what we expect of it, and confidence that it doesn’t break down.

When I first started professional software engineering, testing was limited to what the developer did as they wrote their code. Mainstream unit testing hadn’t quite hit at that point, so what testing was done was quite manual.

The developer tested things manually, the product owner/champion did some (manual) user acceptance testing, then, boom, your code was released into the wild, whereupon the users would find bugs and the developer’s phone ran red hot for a while until all the bugs had been ironed out.

This was termed “Testing in production”, and, apart from manual testing by dedicated testers (if you were lucky enough to be working at a company that employed some), it was about the limit of testing.

Automated testing was on the horizon, though. The essential idea is that a computer can run tests over and over without getting bored, and faithfully report when the output of the software doesn’t match what was expected for a given input.

Once automated testing became mainstream, developers, owners, managers, and users became confident that the tests being run proved that the software would perform as claimed.

Alas, that confidence is misplaced.

The truth is, there are limitations on tests. They have always been there, but the allure of automated testing obscured the problem a little.

This issue exists for TDD, BDD, unit testing, end-to-end testing, fuzzing, property testing, and a whole host of other testing regimes.

Allow me to demonstrate, and hopefully show the issue clearly.

Let’s say we have a piece of code, and we run a single test on it:

// The body of Foo is deliberately left out.
func Foo(a, b int) int {}

func TestFoo(t *testing.T) {
    output := Foo(2, 2)
    if output != 4 {
        // t.Fail takes no arguments; t.Errorf records the failure with a message.
        t.Errorf("Foo(2, 2) = %d, want 4", output)
    }
}

The implementation of Foo is hidden because I want to show you something. The test function provides the code being tested with two 2s, and expects a 4 back.

Now, there are (at least) three different implementations of Foo that could take that input and return that result: addition, multiplication, and exponentiation. Which one is correct?
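To make the ambiguity concrete, here is a minimal sketch of the three candidates. The names addFoo, mulFoo, and powFoo are invented for illustration; none of them is claimed to be the real Foo.

```go
package main

import "fmt"

// Three hypothetical candidates for the hidden Foo.
func addFoo(a, b int) int { return a + b }
func mulFoo(a, b int) int { return a * b }

// powFoo raises a to the power b (for b >= 0) by repeated multiplication.
func powFoo(a, b int) int {
	result := 1
	for i := 0; i < b; i++ {
		result *= a
	}
	return result
}

func main() {
	candidates := []struct {
		name string
		fn   func(a, b int) int
	}{
		{"add", addFoo},
		{"mul", mulFoo},
		{"pow", powFoo},
	}
	for _, c := range candidates {
		// Every candidate returns 4 for (2, 2); a second input such as
		// (2, 3) is needed to tell them apart.
		fmt.Printf("%s: Foo(2, 2) = %d, Foo(2, 3) = %d\n",
			c.name, c.fn(2, 2), c.fn(2, 3))
	}
}
```

All three pass the original test, so the test on its own says nothing about which behaviour was intended.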

For this highly contrived example the answer is obvious: another test will clear up the confusion. But one more test isn’t enough; there are edge cases to be aware of. And this is the point that I want to demonstrate: you may never have enough tests to be sure that your code can handle everything thrown at it.

Now, regardless of the testing methodology, the only way to be 100% sure that your code will deal with every possible (combination of) input is to actually give it every possible (combination of) input. An exhaustive test.

That’s rarely practical (two 64-bit integers alone allow 2^128 combinations), and, in many cases, most of the tests being done would be redundant.
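Exhaustive testing only becomes feasible when the input space is tiny. As a rough sketch, assume a hypothetical add8 that stands in for Foo but is narrowed to int8, so that every pair of inputs can be enumerated and compared against the true sum computed in a wider type:

```go
package main

import "fmt"

// add8 is a hypothetical stand-in for Foo, narrowed to int8 so the
// whole input space (256 * 256 pairs) can be enumerated.
func add8(a, b int8) int8 { return a + b }

// countMismatches feeds add8 every possible pair of inputs and counts
// the pairs whose int8 result differs from the true sum.
func countMismatches() (total, wrong int) {
	for a := -128; a <= 127; a++ {
		for b := -128; b <= 127; b++ {
			total++
			if int(add8(int8(a), int8(b))) != a+b {
				wrong++ // the int8 result wrapped around
			}
		}
	}
	return total, wrong
}

func main() {
	total, wrong := countMismatches()
	fmt.Printf("%d pairs tried, %d gave a wrapped result\n", total, wrong)
}
```

Of the 65,536 pairs, 16,384 wrap around. That is a class of input which a single Foo(2, 2) test never touches.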

What’s really needed is tests that cover all the possible classes of input your code will need to deal with. Unfortunately, this requires some clear ideas about what those classes are.

Almost every developer carries round in their mind experiences of bugs that they have had to deal with, and writes tests to prove that those bugs do not lurk in their code. For example, a developer might recognise through experience that Foo needs to handle integer overflow, and so will create input that demonstrates how Foo behaves when dealing with values that cause it.

This is helpful, but relies on the experience and wherewithal of the developer, or of anyone else involved in creating the software, to know to test for that occurrence.
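As an illustration of testing by class of input, the sketch below assumes Foo is addition and wraps it in a hypothetical checkedAdd that reports overflow instead of silently wrapping. The table has one case per class that an experienced developer might think of.

```go
package main

import (
	"fmt"
	"math"
)

// checkedAdd is a hypothetical overflow-aware stand-in for Foo.
// It returns ok == false when a + b does not fit in an int.
func checkedAdd(a, b int) (sum int, ok bool) {
	if b > 0 && a > math.MaxInt-b {
		return 0, false // would exceed math.MaxInt
	}
	if b < 0 && a < math.MinInt-b {
		return 0, false // would fall below math.MinInt
	}
	return a + b, true
}

func main() {
	// One case per class of input: ordinary values, each overflow
	// boundary, and extremes that cancel out.
	cases := []struct {
		name   string
		a, b   int
		wantOK bool
	}{
		{"ordinary", 2, 2, true},
		{"max plus one", math.MaxInt, 1, false},
		{"min minus one", math.MinInt, -1, false},
		{"extremes cancel", math.MaxInt, math.MinInt, true},
	}
	for _, c := range cases {
		_, ok := checkedAdd(c.a, c.b)
		fmt.Printf("%-16s ok=%t want=%t\n", c.name, ok, c.wantOK)
	}
}
```

The table is only as complete as the list of classes its author knew to write down.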

Some testing approaches let the computer generate randomised input and measure the effects. This is helpful, but relies on the chance that the generated input covers classes of input not already explicitly tested for by the developer.
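The role of chance can be sketched with Go’s standard testing/quick package. The buggyFoo function and its planted defect are invented for illustration: it is addition that goes wrong for exactly one pair of inputs.

```go
package main

import (
	"fmt"
	"testing/quick"
)

// buggyFoo is a hypothetical Foo: addition with one deliberately
// planted defect, for illustration only.
func buggyFoo(a, b int) int {
	if a == 1234 && b == 5678 {
		return 0 // planted bug
	}
	return a + b
}

// matchesAddition is the property under test: the result equals a + b.
func matchesAddition(a, b int) bool {
	return buggyFoo(a, b) == a+b
}

func main() {
	// quick.Check calls the property with randomly generated arguments
	// and returns a non-nil error if any call returns false.
	err := quick.Check(matchesAddition, &quick.Config{MaxCount: 10000})
	fmt.Println("random check error:", err)

	// The planted case fails when it is tried directly.
	fmt.Println("planted case holds:", matchesAddition(1234, 5678))
}
```

Because the random arguments are drawn from the whole range of int, ten thousand attempts will almost certainly miss the one failing pair, and the random check reports no error even though the bug is there.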

So, ultimately, even though there have been huge advances in testing, the fact of life is that the users of the software are the real testers, and they will, accidentally or deliberately, feed the software input that may put it into a state the developer never thought of.

Note: This doesn’t mean “don’t bother testing”; it means “think of more ways to test, because the users will definitely find input that may not have been considered”.

Shane Howearth
