Higher-Order Testing

Summary

Overview

When you finish unit testing a program, you have really just begun the testing process. Consider this important concept: a software error occurs when a program does not do what its end users expect it to do. You cannot guarantee that you have found all errors, even if you have done perfect unit testing. Therefore, to complete the testing process, further forms of testing are required. These further tests are called higher-order tests.

The following diagram illustrates a software development cycle and the associated testing processes. Note how distinct testing processes are associated with distinct development processes:

The flow of the process can be summarized in seven steps:

  1. Requirements are a written translation of the users' needs. These are the goals of the product; in other words, why the program is needed.

  2. Objectives specify what the program should do and how well it should do it.

  3. Objectives are translated into a precise product specification, viewing the product as a black box and considering only its interfaces and interactions with the end user.

  4. System design partitions the system into sub-systems, components, and individual programs, and defines their interfaces.

  5. The structure of the program is designed by specifying the function of each module, the hierarchical structure of modules, and the interfaces between modules.

  6. A precise specification is developed that defines the interface to, and the function of each module.

  7. Through one or more sub-steps, the module interface specification is translated into source code.

With respect to testing, each testing process is focused on a particular development step. For example, unit testing is focused on the module specifications, function testing on the external specification, system testing on the objectives, and acceptance testing on the requirements.

Note that the sequence of testing processes presented above does not necessarily imply a time sequence. For example, system testing could very well be overlapped in time with other testing processes.

The remaining sections discuss each kind of testing in more detail.

Unit Testing

Definition: Unit testing verifies each individual unit (i.e., class) in isolation by running tests in an artificial environment.

Unit tests represent the lowest level of tests. Unit testing is the first opportunity to exercise source code. This strict form of unit testing is very fine-grained and works on a method-by-method basis. One class is tested at a time, with mocks standing in for helper or dependent classes. Only public methods are covered. By testing each unit in isolation, and ensuring that each works on its own, it becomes much easier to locate and isolate problems than if the unit were part of a larger system.
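
As a minimal sketch (using NUnit, with a hypothetical PriceCalculator unit, a hypothetical IRateProvider dependency, and a hand-written stub rather than a mocking library), a unit test of this kind might look as follows:

    using NUnit.Framework;

    // Hypothetical unit under test and its dependency, defined here only for illustration.
    public interface IRateProvider { decimal GetRate(string currency); }

    public class PriceCalculator
    {
        private readonly IRateProvider rates;
        public PriceCalculator(IRateProvider rates) { this.rates = rates; }
        public decimal ToEuro(decimal amount, string currency)
        {
            return amount * rates.GetRate(currency);
        }
    }

    // A hand-written stub stands in for the real rate provider so the unit runs in isolation.
    class FixedRateStub : IRateProvider
    {
        public decimal GetRate(string currency) { return 0.5m; }
    }

    [TestFixture]
    public class PriceCalculatorTests
    {
        [Test]
        public void ToEuro_MultipliesAmountByTheProvidedRate()
        {
            var calculator = new PriceCalculator(new FixedRateStub());
            Assert.AreEqual(5.0m, calculator.ToEuro(10m, "USD"));
        }
    }

Because the stub returns a fixed rate, a failure in this test points directly at PriceCalculator rather than at any of its collaborators.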

It is also possible to use unit-testing frameworks to perform other types of testing, such as function testing and regression testing.

Integration Testing

Definition: Integration testing verifies that the combined units function together correctly.

While unit testing represents the lowest level of tests by working at the method-by-method level, integration testing verifies the ways in which multiple entities interact with each other. Integration testing facilitates finding problems that occur at the interface or communication level between the individual parts. When a problem does occur, it is easier to find the root cause because the newly added components are always the prime suspects.
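
As a hedged sketch (again using NUnit, with hypothetical OrderService and InMemoryOrderRepository classes defined inline), an integration test exercises the real collaborators together instead of replacing them with mocks:

    using System.Collections.Generic;
    using NUnit.Framework;

    // Hypothetical order-processing classes, trimmed down for illustration.
    public class Order { public int Id; public string Product; public int Quantity; }

    public class InMemoryOrderRepository
    {
        private readonly Dictionary<int, Order> store = new Dictionary<int, Order>();
        private int nextId = 1;
        public int Save(Order order) { order.Id = nextId++; store[order.Id] = order; return order.Id; }
        public Order FindById(int id) { return store[id]; }
    }

    public class OrderService
    {
        private readonly InMemoryOrderRepository repository;
        public OrderService(InMemoryOrderRepository repository) { this.repository = repository; }
        public int PlaceOrder(string product, int quantity)
        {
            return repository.Save(new Order { Product = product, Quantity = quantity });
        }
    }

    [TestFixture]
    public class OrderServiceIntegrationTests
    {
        [Test]
        public void PlacedOrder_CanBeReadBackThroughTheRepository()
        {
            var repository = new InMemoryOrderRepository();   // a real collaborator, not a mock
            var service = new OrderService(repository);

            int id = service.PlaceOrder("sample product", 3);

            Assert.AreEqual(3, repository.FindById(id).Quantity);
        }
    }

A failure here implicates the interface between OrderService and the repository rather than either class in isolation.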

Four integration strategies exist, each with its own advantages and disadvantages (the first two were explored in depth in Unit Testing):

Function Testing

Definition: Function testing is a process of attempting to find discrepancies between the actual program behaviour and the description of the program's behaviour from the end user's point of view.

Function testing is usually a black-box activity where you rely on the earlier unit-testing process to achieve the desired white-box testing logic-coverage criteria. For example, function testing on an FTP client might test whether the client can connect, download, upload, or list a directory.

To perform a function test, the external specification (i.e., the precise description of the program's behaviour from the user's point of view) is analyzed to derive a set of test cases. Equivalence partitioning, boundary-value analysis, cause-effect graphing, and error-guessing methods are especially pertinent to function testing.
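
For instance (a hypothetical sketch using NUnit's [TestCase] attribute), suppose the external specification states that a withdrawal amount is valid from 1 to 500 inclusive; boundary-value analysis then suggests at least the following function-test cases:

    using NUnit.Framework;

    // Hypothetical rule derived from the external specification: valid amounts are 1..500 inclusive.
    public static class WithdrawalRules
    {
        public static bool IsValidAmount(int amount) { return amount >= 1 && amount <= 500; }
    }

    [TestFixture]
    public class WithdrawalFunctionTests
    {
        [TestCase(0, false)]    // just below the lower boundary
        [TestCase(1, true)]     // lower boundary
        [TestCase(500, true)]   // upper boundary
        [TestCase(501, false)]  // just above the upper boundary
        public void AmountValidation_MatchesTheExternalSpecification(int amount, bool expected)
        {
            Assert.AreEqual(expected, WithdrawalRules.IsValidAmount(amount));
        }
    }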

Function testing is usually run on whole applications, while unit and integration testing are usually run on subsystems or methods. In other words, while unit and integration tests are run at fairly fine-grained levels and are generally of interest only to programmers, function tests exist at a higher level and are less technical.

System Testing

Definition: System testing verifies the entire product, after having integrated all software and hardware components, and validates it according to the original project requirements.

System testing is the most misunderstood and most difficult testing process. System testing is not a process of testing the functions of the complete system, because this would be redundant with function testing. The purpose of system testing is to compare the system or program to its original objectives. Given this purpose, there are two implications:

By having a fully integrated system, the tester can evaluate many attributes that cannot be assessed at lower levels of testing. Some of the major categories of system testing are described in the subsections below.

Before discussing each of these categories of system testing, one of the most vital considerations in implementing the system test should be resolved: who should do the system test? Simply stated, programmers should not perform a system test, and of all the testing phases, this is the one that the organization/team responsible for developing the software definitely should not perform.

The first point comes from the fact that the person performing the system test must be able to think like an end user. Obviously then, a good testing candidate is one or more end users, if possible. However, the typical end user will not have the ability or expertise to perform many of the categories of tests described below, so an ideal system-test team might be composed of a few professional system-test experts, a representative end user, and the key original analysts/designers of the program. The second point comes from the fact that a system test is an "anything goes" activity, and the development organization/team has psychological ties to the software that are counter to this kind of activity. At the least, the system test should be performed by an independent group of people with few, if any, ties to the development organization/team.

Facility Testing

This is the most obvious type of system testing. Facility testing is the determination of whether each facility mentioned in the objectives was actually implemented. The procedure is to scan the objectives document sentence-by-sentence and determine that the program satisfies each stated objective. Usually, a checklist is helpful to ensure that you mentally check the same objective the next time you perform the same test.

Volume Testing

Volume testing is subjecting the program to heavy volumes of data. For example, a trading system would be given a huge number of trades to process. The purpose of volume testing is to show that the program cannot handle the volume of data specified in its objectives.
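
As a rough sketch (the TradeProcessor class and the figure of one million trades are assumptions for illustration, not taken from any real objective), a volume test pushes a large, specification-driven amount of data through the system and checks that nothing is lost:

    using System.Linq;
    using NUnit.Framework;

    // Hypothetical, greatly simplified trade processor used only to show the shape of a volume test.
    public class TradeProcessor
    {
        public int Processed { get; private set; }
        public void Process(decimal amount) { Processed++; }
    }

    [TestFixture]
    public class TradeVolumeTests
    {
        [Test]
        public void ProcessesOneMillionTradesWithoutLosingAny()
        {
            var processor = new TradeProcessor();

            foreach (var amount in Enumerable.Range(1, 1000000))
                processor.Process(amount);

            Assert.AreEqual(1000000, processor.Processed);
        }
    }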

Stress Testing

Stress testing is forcing the system to operate with an unreasonable load while denying it the resources needed to process that load. The goal is to push the system beyond the limits of its stated requirements to find potentially serious bugs. Stress testing should not be confused with volume testing. For example, if the software under test were Microsoft Word, a volume test would be to load a 10,000-page document, while a stress test would be to open 50 normal documents (say 10-20 pages each) under low memory conditions.

Stress testing is particularly common with web-based applications, where you want to ensure that your application and hardware can handle a given volume of concurrent users. You could argue that millions of users might access the system at the same time, but this is usually not a reasonable assumption. Therefore, you need to understand the intended audience when designing a stress test.
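
A minimal sketch of such a test is shown below; the URL and the figure of 200 concurrent users are assumptions for illustration, not requirements taken from any real system:

    using System.Linq;
    using System.Net.Http;
    using System.Threading.Tasks;
    using NUnit.Framework;

    [TestFixture]
    public class StressTests
    {
        [Test]
        public async Task SearchPage_SurvivesTwoHundredConcurrentRequests()
        {
            using (var client = new HttpClient())
            {
                // Fire 200 requests at the (placeholder) endpoint at the same time.
                var requests = Enumerable.Range(0, 200)
                                         .Select(i => client.GetAsync("https://example.test/search?q=stress"));

                var responses = await Task.WhenAll(requests);

                Assert.IsTrue(responses.All(r => r.IsSuccessStatusCode));
            }
        }
    }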

Usability Testing

Usability testing is assessing the program's user-friendliness and identifying operations that may be difficult for users. The following are some items that may be tested for usability:

  1. Has the UI been tailored to the intelligence, educational background and environmental pressures of the end user?
  2. Are the outputs of the program (especially errors) meaningful and easy to understand?
  3. Do all screens exhibit conceptual integrity and uniformity of syntax, conventions, format, style and abbreviations?
  4. etc.

Security Testing

Security testing verifies that only authorized users have access to allowable features. The purpose of security testing is to devise test cases that subvert the program's security checks. One way to devise such tests is to study known security problems in similar systems and generate test cases that attempt to demonstrate similar problems in the system you are testing.

Web-based applications often need a higher level of security testing than do most applications, especially e-commerce systems.

See Security chapter in Distributed Design.
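
As one small, hedged example (the URL below is a placeholder for a protected resource in the system under test), a security test can assert that a request without credentials is rejected rather than served:

    using System.Net;
    using System.Net.Http;
    using System.Threading.Tasks;
    using NUnit.Framework;

    [TestFixture]
    public class SecurityTests
    {
        [Test]
        public async Task ProtectedResource_RejectsRequestsWithoutCredentials()
        {
            using (var client = new HttpClient())
            {
                // No Authorization header is supplied, so the server should refuse the request.
                var response = await client.GetAsync("https://example.test/admin/users");

                Assert.AreEqual(HttpStatusCode.Unauthorized, response.StatusCode);
            }
        }
    }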

Performance Testing

Performance testing measures how long each task takes, under normal and peak conditions, to ensure that the system responds within the specified time constraints.

See Performance chapter in Distributed Design.
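
A minimal sketch of a performance test is shown below; ProcessDailyTrades() and the two-second budget are assumptions for illustration, and real limits should come from the stated performance objectives:

    using System.Diagnostics;
    using System.Threading;
    using NUnit.Framework;

    [TestFixture]
    public class PerformanceTests
    {
        [Test]
        public void DailyTradeProcessing_CompletesWithinTwoSeconds()
        {
            var stopwatch = Stopwatch.StartNew();

            ProcessDailyTrades();

            stopwatch.Stop();
            Assert.LessOrEqual(stopwatch.ElapsedMilliseconds, 2000);
        }

        // Stand-in for the real operation whose duration is being measured.
        private static void ProcessDailyTrades()
        {
            Thread.Sleep(100);
        }
    }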

Configuration Testing

Configuration testing assesses whether the software operates under the stated configuration settings. For example, a trade server may be configured to read files from a memory queue or from a database, to turn tracing on/off, to commit/rollback all database transactions, to send or ignore notifications, and so on.
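
A small sketch of this idea (the ServerSettings and TradeServer classes are hypothetical simplifications) is to parameterize one test over each supported configuration value, so the same behaviour is verified under every stated setting:

    using NUnit.Framework;

    // Hypothetical settings object and server, trimmed down for illustration.
    public class ServerSettings { public string TradeSource; public bool TracingEnabled; }

    public class TradeServer
    {
        private readonly ServerSettings settings;
        public TradeServer(ServerSettings settings) { this.settings = settings; }
        public string DescribeSource() { return "Reading trades from " + settings.TradeSource; }
    }

    [TestFixture]
    public class ConfigurationTests
    {
        // The same check runs once for each supported value of the trade-source setting.
        [TestCase("memory-queue")]
        [TestCase("database")]
        public void ServerReportsTheConfiguredTradeSource(string source)
        {
            var server = new TradeServer(new ServerSettings { TradeSource = source });

            StringAssert.Contains(source, server.DescribeSource());
        }
    }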

Reliability Testing

Reliability testing verifies that the system operates under stated conditions for a specified time period. The goal of all types of testing is to improve software reliability, but if the program's objectives contain specific statements about reliability, then specific reliability tests are required. Testing reliability objectives is difficult. For example, how would you test a reliability objective which states that the program must achieve a targeted uptime of 99.0% over the lifetime of the system?
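
As a rough illustration of why this is hard: 99.0% uptime allows at most 1% of the total period as downtime, which over a single year is about 0.01 x 365 x 24 = 87.6 hours (roughly 3.65 days) of accumulated outage. Any reliability test can only observe a small sample of the system's lifetime, so demonstrating such an objective usually relies on long-running measurements and statistical models rather than on a single test run.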

See Reliability chapter in Distributed Design.

Recovery Testing

Recovery testing verifies that the system can recover to a usable state after having experienced a crash, hardware failure, or other similarly damaging problems. Simulating each of these problems is usually quite easy (invalid pointers, disconnected hard drives, etc.).

Often, one design goal of critical systems is to minimize the mean time to recovery (MTTR). The MTTR will usually have an upper and lower boundary, so your test cases should reflect these bounds.

Serviceability Testing

Serviceability testing ensures that internal maintenance information, such as traces and diagnostic messages, works as documented.

Documentation Testing

Documentation testing ensures that all user and system documentation reflects the state of the system. One possible way to do this is to use the documentation to derive the test cases for each of the aforementioned system tests. For example, you would use the documentation as a guide for writing test cases for stressing the system.

Procedure Testing

Procedure testing ensures that all manual procedures performed by people for the correct operation of the system are themselves correct. A large system may contain programs that are not completely automated and may require human intervention for correct operation. For example, a database application may require manual operations for rolling back transactions. Any prescribed human procedures, such as those for the system operator, database operator, or even the end user, should be tested during the system test.

Regression Testing

Definition: Regression testing consists of running previously executed tests on new code to ensure that features that worked in previous versions still work as expected.

Changes might be the addition of new features, the removal of an existing feature, or simply the refactoring of existing code without adding or deleting any features. Regardless of the type of change, we run regression tests to ensure that the change did not break any existing code.

Unit, integration, function, stress and acceptance tests can all be run as regression tests. As such, regression testing is not another type of testing, but rather another way of looking at existing tests. Unit tests of all kinds are frequently run as regression tests. By automating these tests and reviewing their results, we can confirm that nothing has been broken by recent changes to the code base.
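
One common way to organize this in NUnit is to tag tests with a category so the whole regression suite can be re-run automatically after every change; the pricing rule below is a hypothetical example:

    using NUnit.Framework;

    // Hypothetical pricing rule: orders of 100 or more receive a 10% discount.
    public static class PriceList
    {
        public static decimal DiscountedTotal(decimal amount)
        {
            return amount >= 100m ? amount * 0.9m : amount;
        }
    }

    [TestFixture]
    public class PricingRegressionTests
    {
        // The test body never changes between versions; re-running it after each
        // modification confirms that the existing behaviour still holds.
        [Test, Category("Regression")]
        public void DiscountIsStillAppliedToLargeOrders()
        {
            Assert.AreEqual(90m, PriceList.DiscountedTotal(100m));
        }
    }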

Acceptance Testing

Definition: An acceptance test validates the system against the user's requirements and ensures that the application is ready for operational use.

All the previous tests (unit, function, performance, etc.) help the developer get the application ready to pass an acceptance test. As such, acceptance testing is often performed by the customer to determine whether or not the application meets the customer's requirements; it is not considered the responsibility of the development team/organization. Acceptance testing often includes some form of function and performance testing, and issues such as usability and look and feel are also part of this test.

Installation Testing

This is an unusual kind of testing because its purpose is not to find software errors, but to find errors that occur during the installation process. Many events usually take place during installation, for example:

  1. Selecting a variety of options. Certain options may enable or disable others.
  2. Locating files and libraries.
  3. Verifying required hardware configurations.
  4. Establishing any required Internet/intranet connectivity.
  5. etc.

Test cases must not only cover all of the above, but also ensure that full, partial, and upgrade installations work as documented, and that the uninstallation process works correctly as well.

Test Planning

Ideally, there should be a test plan at every level of testing, from unit testing through integration testing to system testing. A test plan describes the requirements, resources, strategies, and schedule for testing an application. A good test plan contains the following components:

Objectives: The objectives of each testing phase must be defined.
Completion Criteria: Criteria that specify when each testing phase will be complete.
Schedules: Time schedules for each testing phase.
Responsibilities: For each phase, the people who will design, write, execute and verify test cases, and the people who will fix discovered errors, should be identified.
Tools: Required test tools must be identified (often NUnit and NMock).
Equipment resources: The machines and test benches that are required.
Hardware configuration: If special hardware configurations are required, a description of these hardware requirements, why they are needed, and how they will be met.
Integration: Describes how the different parts of the program will be pieced together (for example, incremental top-down testing). For a large system, a system integration plan may be necessary. The system integration plan defines the order of integration, the functional capability of each version of the system, and the responsibilities for producing mock objects that simulate non-existent components.
Tracking procedures: How the overall progress of testing will be tracked, including the location of error-prone modules and estimation of progress with respect to the schedule, resources, and completion criteria.
Debugging procedures: Mechanisms must be defined for reporting detected errors, tracking the progress of corrections, and adding corrections to the system.