Science, Mathematics and Unit Testing

I have always considered testing to be an important part of quality software development.  However, I find the focus on unit testing in commercial environments worrying, particularly when combined with junior developers.  What has this to do with science and mathematics?  Read on…

Ultimate testing objectives

The final aim of any test is to prove (beyond reasonable doubt) that a software asset (the type of asset typically depends on the kind of test) is fit for purpose.  This therefore leads to the conclusion that testing is all about proofs.

Proofs

The methods of arriving at a proof are related to other areas of engineering:

  1. Formal proof based on mathematical reasoning — the epitome of pure mathematics
  2. Statistical verification
  3. Educated guesswork
  4. Uneducated guesswork!

Unfortunately, much test development is left to the fourth of these!  Let me clarify.

Formal methods for software development do exist and they are commonly applied in certain specialist areas.  These methods are regularly used in algorithmic analysis and design but they do not easily fit today's large distributed systems in commercial environments.  It is not that they cannot be used, but rather that they are often too expensive to apply in general.  However, they should be considered in critical situations such as security, transactional behaviour boundaries, life-threatening situations (such as medical and aviation devices) and the like.

Statistical verification is very much related to science.  The proof is reasonably achieved by a careful demonstration of a probabilistic outcome, similar to empirical analysis.  This depends on an understanding of the formal requirements of the software unit:

  1. Preconditions state what the software unit can assume on entry
  2. Postconditions state what the software unit must cause as output and/or change of state

True statistical analysis does not require any more than this.  However, the range of input and state permutations, along with the relative probabilities of each, is typically included (however informally) in the analysis.  This information is used to direct a statistically significant set of test cases against the software unit.  Note that this approach does not require information about the internal implementation of the software unit.
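As a rough illustration of this black-box approach, the Python sketch below samples inputs that satisfy a precondition and checks each result against a postcondition only.  The unit `integer_sqrt`, the input range and the uniform sampling are all assumptions made for this example; a real analysis would weight the inputs by their expected probabilities.

```python
import random


def integer_sqrt(n):
    """Hypothetical unit under test: the floor of the square root of n."""
    if n < 0:
        raise ValueError("precondition violated: n must be non-negative")
    # Binary search keeping the invariant low*low <= n < high*high.
    low, high = 0, n + 1
    while high - low > 1:
        mid = (low + high) // 2
        if mid * mid <= n:
            low = mid
        else:
            high = mid
    return low


def satisfies_postcondition(n, result):
    """Contract only: result is the largest integer whose square is <= n."""
    return result >= 0 and result * result <= n < (result + 1) * (result + 1)


def sample_contract_failures(unit, trials, seed, upper=10**12):
    """Run `unit` on random valid inputs and collect those that break the contract.

    Inputs are drawn uniformly from [0, upper); this distribution is an
    assumption of the sketch, not a property of the unit.
    """
    rng = random.Random(seed)  # seeded so that a failing run can be reproduced
    failures = []
    for _ in range(trials):
        n = rng.randrange(0, upper)  # always satisfies the precondition n >= 0
        if not satisfies_postcondition(n, unit(n)):
            failures.append(n)
    return failures
```

Calling `sample_contract_failures(integer_sqrt, trials=1000, seed=42)` returns an empty list for this implementation.  The check never looks inside `integer_sqrt`, so the implementation could be replaced without touching the test.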

One area of particular statistical importance is the concept of an edge case.  This presumes that there is a continuous range of inputs to a software unit and that these edge cases represent input cases that mark a distinct change of behaviour in the software unit.  Some edge cases are prescribed in the formal statement of the problem, others are implied by the technical environment, others are standard modes of failure and a further set are specific to the design or implementation of the solution.
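A minimal sketch of probing edge cases that are prescribed by the problem statement, again in Python.  The unit `parcel_band` and its thresholds are hypothetical; the idea is simply to test just below, at and just above every point where the behaviour is supposed to change.

```python
def parcel_band(weight_g):
    """Hypothetical unit: classify a parcel by its weight in whole grams."""
    if weight_g <= 0:
        raise ValueError("weight must be positive")
    if weight_g <= 2000:
        return "small"
    if weight_g <= 20000:
        return "large"
    raise ValueError("parcel too heavy")


def boundary_inputs(thresholds):
    """Return the values just below, at and just above each threshold."""
    values = set()
    for threshold in thresholds:
        values.update((threshold - 1, threshold, threshold + 1))
    return sorted(values)


def observe(unit, value):
    """Record the unit's visible behaviour, treating ValueError as a rejection."""
    try:
        return unit(value)
    except ValueError:
        return "rejected"


# Thresholds taken from the (assumed) problem statement, not from the code.
observed = {w: observe(parcel_band, w) for w in boundary_inputs([0, 2000, 20000])}
```

Here `observed` maps 2000 to "small" and 2001 to "large", while 0 and 20001 are rejected.  Edge cases implied by the technical environment or by the implementation (integer limits, empty collections, timeouts) would need to be added to the list separately.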

The third method, educated guesswork, presumes relevant education and this is the problem.  More often than is desirable, the test designer is actually the software developer and is naturally biased.  Worse, the developer has knowledge of the internal details of the software unit and directs tests against that knowledge.  A testing approach based on educated guesswork will apply the full knowledge of both statistical methods and formal analysis in an attempt to assess likely edge cases and failure modes and to design tests to determine the behaviour in these cases, along with a sample set of standard cases to illustrate correct, normative behaviour.

The final method is uneducated guesswork and can be further split into two categories: where the implementation is known and where it is not.  The former case allows for good tests of correct behaviour but does not assist in detecting unanticipated failure modes.  It also results in tests that are tightly coupled to the actual implementation.

Unit testing

Unit testing is a testing approach where the smallest coherent part of a software system is tested independently.  In order to achieve this across large projects with complicated dependencies, two major decisions are often applied:

  1. The use of testing frameworks, including mocks and stubs (see http://martinfowler.com/articles/mocksArentStubs.html)
  2. The software developer is typically the primary unit test developer for his/her own code

Although other developers may be used to test the code, they are often part of the same development team, which is subject to team and organisational blind spots.  This does not in itself address the shortcomings of a developer testing his/her own code (although it may still have some limited benefit).

The testing frameworks are an attempt to reduce the scope of work under test.  In theory, this allows the tests to be more carefully targeted in order to write simpler tests.  However, this has a negative impact (also described in part by Martin Fowler; see above):

  1. Behaviour-driven approaches (such as some mock frameworks) are tightly coupled to the software unit’s implementation, not necessarily its formal contract
    • This type of testing may be appropriate, however, when a software unit is formally described by a UML sequence diagram
  2. Data abstraction approaches (such as the use of the Repository pattern) may fail to correctly model subtle aspects of underlying dependent components
  3. The need for tests to be statistically significant is not addressed
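The first point can be sketched with Python's standard `unittest.mock` module.  The repository class, its `get_price` and `get_prices` methods and both `total_*` functions are hypothetical names invented for this example.  Both implementations honour the same contract, but only one of them passes the interaction-based test.

```python
from unittest.mock import Mock, call


class InMemoryPrices:
    """Hypothetical fake repository that holds prices in a dictionary."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def get_price(self, item_id):
        return self._prices[item_id]

    def get_prices(self, item_ids):
        return [self._prices[item_id] for item_id in item_ids]


def total_one_by_one(repo, item_ids):
    """First implementation: one repository call per item."""
    return sum(repo.get_price(item_id) for item_id in item_ids)


def total_batched(repo, item_ids):
    """Refactored implementation: a single batched repository call."""
    return sum(repo.get_prices(item_ids))


def contract_test(unit):
    """State-based check: only the returned total matters."""
    repo = InMemoryPrices({"a": 3, "b": 4})
    return unit(repo, ["a", "b"]) == 7


def interaction_test(unit):
    """Mock-based check: insists on exactly how the repository is called."""
    repo = Mock()
    repo.get_price.return_value = 1
    repo.get_prices.return_value = [1, 1]
    unit(repo, ["a", "b"])
    return repo.get_price.call_args_list == [call("a"), call("b")]
```

`contract_test` passes for both `total_one_by_one` and `total_batched`, whereas `interaction_test` passes only for `total_one_by_one`, even though the refactoring did not change the observable result.  The fake repository also illustrates the second point: it knows nothing about latency, transactions or missing rows in a real data store, so a passing test says little about those behaviours.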

Testing software units well is not easy.  A poor application of unit testing can lead to a false sense of security and a failure to adequately assess core code.  Unit testing can be useful as an aid in checking for breaking changes and for automated testing.  However, the effort required to produce suitable unit tests is substantial and proportional to the combined complexity of a software unit, its inputs, its possible states and its dependencies.  Unit testing using these frameworks and approaches does not actually lead to true proofs of code quality.

The service orientation advantage

A major advantage of service orientation is that it positions agnostic service logic (logic that does not relate to a specific business process) into separate services.  These are independently testable, are autonomous and have clear contracts.  These are therefore prime candidates for high quality testing by educated testers.

Conclusions

I do not advocate the total removal of unit testing from software development, but perhaps there is scope for careful targeting of test development resources.  Junior developers need to be trained to see the full range of test requirements and to avoid coding-for-test (much like the teaching-for-exam phenomenon in education).

Above all, let us be a pragmatic and thinking profession where we use our tools to achieve real returns and not just because we have been told that they are (abstractly) good.
