The most important aspect of program testing is the design and use of effective test cases. Test-case design is so important because complete testing is impossible. Given constraints on time and cost, the key issue in designing test cases is:
What subset of all possible test cases has the highest probability of detecting most errors?
In general, the least effective methodology is random-input testing: the process of testing a program by randomly selecting a subset of all possible input values. As was shown in the Economics of Testing section, exhaustive black-box and white-box testing are, in general, impossible. The main approach to designing test cases is to use black-box-oriented test-case-design methodologies, supplemented as necessary with test cases derived from white-box methods. The methodologies discussed are:
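White Box: statement coverage, decision coverage, condition coverage, decision/condition coverage, and multiple-condition coverage.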
Note: Given a statement such as if ((a > 1) && (b < 0)), the term ((a > 1) && (b < 0)) is called a decision, while each constituent part, i.e., (a > 1) and (b < 0), is called a condition.
Black Box: equivalence partitioning, boundary value analysis, cause-effect graphing, and error guessing.
The ultimate white-box test is the execution of every path in the program, but complete path testing is not a realistic goal. You might try to downgrade complete path testing to statement coverage, that is, executing every statement in the program at least once. However, this criterion is so weak that it is generally useless. For example, consider this simple C# method:
public void TestValues(int a, int b, ref int x)
{
    // Set x to 1 when both inputs are positive.
    if ((a > 0) && (b > 0))
        x = 1;
    // Set x to -1 when both inputs are negative.
    if ((a < 0) && (b < 0))
        x = -1;
}
To execute every statement, you would need two test cases: one where a = 1 and b = 1, and another where a = -1 and b = -1. But what if the first comparison on a should have been >= rather than >? Statement coverage cannot detect this logic error, because both test cases produce the same results under either comparison.
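The weakness can be made concrete with a small sketch. It is written in Python rather than C# purely to keep it short and self-contained. `compute_x` is an assumed port of the method above that returns x instead of using a ref parameter, and `compute_x_intended` is a hypothetical variant in which the first comparison on a is >=.

```python
def compute_x(a, b, x=0):
    """Assumed Python port of TestValues: returns the final value of x."""
    if a > 0 and b > 0:
        x = 1
    if a < 0 and b < 0:
        x = -1
    return x


def compute_x_intended(a, b, x=0):
    """Hypothetical intended version: the first comparison on a is >=."""
    if a >= 0 and b > 0:
        x = 1
    if a < 0 and b < 0:
        x = -1
    return x


# The two test cases that execute every statement.
statement_coverage_suite = [(1, 1), (-1, -1)]

# Both versions agree on every test in the suite, so the suite cannot
# tell the faulty comparison from the intended one.
undetected = all(
    compute_x(a, b) == compute_x_intended(a, b)
    for a, b in statement_coverage_suite
)
print(undetected)  # True

# An input on the boundary (a = 0) is what separates the two versions.
print(compute_x(0, 1), compute_x_intended(0, 1))  # 0 1
```

Only a test case with a = 0 and b > 0 distinguishes the two versions, and statement coverage gives no reason to choose such a case.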
A stronger logic-coverage criterion is decision coverage. This criterion states that you must write enough test cases that each decision (if-else, switch, while, do) has a true and a false outcome at least once. Decision coverage usually satisfies statement coverage, since every statement lies on some sub-path emanating either from a branch statement or from the program entry point. But because statement coverage is a necessary condition, decision coverage is better defined as follows: decision-coverage testing requires that test cases be written such that each possible outcome of every decision is exercised at least once, and that each statement is executed at least once. For the program above, decision coverage is met by the previous two test cases, one where a = 1 and b = 1 and another where a = -1 and b = -1, as shown below:
Test Case | (a > 0) && (b > 0) | (a < 0) && (b < 0) |
a = 1, b = 1 | true | false |
a = -1, b = -1 | false | true |
Decision coverage is a stronger criterion than statement coverage, but it is still weak. What if the first comparison on a should have been >= rather than >? Decision coverage cannot detect this error either, because the same two test cases still satisfy the criterion and still produce the same results.
A criterion that is sometimes stronger than decision coverage is condition coverage. In condition coverage, you write enough test cases to ensure that each condition in a decision takes on all possible outcomes at least once, and that each statement is executed at least once. In the code above, there are four conditions: a > 0, b > 0, a < 0, and b < 0. Again, it turns out that the same two test cases, one where a = 1 and b = 1 and another where a = -1 and b = -1, satisfy condition coverage, as shown in the following table:
Test Case | a > 0 | b > 0 | a < 0 | b < 0 | (a > 0) && (b > 0) | (a < 0) && (b < 0) |
a = 1, b = 1 | true | true | false | false | true | false |
a = -1, b = -1 | false | false | true | true | false | true |
However, what if we instead used two test cases, one where a = 1 and b = -1 and another where a = -1 and b = 1?
Test Case | a > 0 | b > 0 | a < 0 | b < 0 | (a > 0) && (b > 0) | (a < 0) && (b < 0) |
a = 1, b = -1 | true | false | false | true | false | false |
a = -1, b = 1 | false | true | true | false | false | false |
These test cases cover all condition outcomes, i.e., every condition such as a > 0 has both a true and a false outcome. However, only two of the four decision outcomes are covered: (a > 0) && (b > 0) is never true, and (a < 0) && (b < 0) is never true either. In other words, condition coverage does not necessarily satisfy decision coverage. This problem is addressed by decision/condition coverage. In decision/condition coverage, you write test cases such that each condition in each decision takes on all possible outcomes at least once, each decision takes on all possible outcomes at least once, and each statement is executed at least once.
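The difference between the two test suites can be checked mechanically. The following is a minimal sketch in Python (chosen only for brevity); the names `CONDITIONS`, `DECISIONS`, and `uncovered_outcomes` are illustrative assumptions, and each condition is evaluated on its own, ignoring short-circuit evaluation.

```python
# Conditions and decisions of the example method, written as predicates.
CONDITIONS = {
    "a > 0": lambda a, b: a > 0,
    "b > 0": lambda a, b: b > 0,
    "a < 0": lambda a, b: a < 0,
    "b < 0": lambda a, b: b < 0,
}
DECISIONS = {
    "(a > 0) && (b > 0)": lambda a, b: a > 0 and b > 0,
    "(a < 0) && (b < 0)": lambda a, b: a < 0 and b < 0,
}


def uncovered_outcomes(predicates, suite):
    """Return the (predicate, outcome) pairs that no test in suite produces."""
    missing = []
    for name, predicate in predicates.items():
        seen = {predicate(a, b) for a, b in suite}
        for outcome in (True, False):
            if outcome not in seen:
                missing.append((name, outcome))
    return missing


suite = [(1, -1), (-1, 1)]
print(uncovered_outcomes(CONDITIONS, suite))  # []
print(uncovered_outcomes(DECISIONS, suite))
# [('(a > 0) && (b > 0)', True), ('(a < 0) && (b < 0)', True)]
```

Running the same check on the suite [(1, 1), (-1, -1)] reports no missing outcome for either dictionary, which is why that suite satisfies decision/condition coverage.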
However, a problem with decision/condition coverage is that although it may appear to exercise all outcomes of all conditions, it frequently does not, because certain conditions mask others. For instance, if the first condition in an and expression is false, none of the subsequent conditions need to be evaluated, because the expression is already known to be false (false anded with anything is false). Likewise, if the first condition in an or expression is true, none of the subsequent conditions need to be evaluated, because the expression is already known to be true. Therefore, errors in logical expressions are not necessarily revealed by the condition coverage and decision/condition coverage criteria.
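The masking effect is easy to observe by recording which conditions are actually evaluated. This is a small sketch in Python, whose `and` operator short-circuits in the same way as `&&` in C#; the helper `condition` and the list `evaluated` are assumptions made for the illustration.

```python
evaluated = []


def condition(name, value):
    """Record that a condition was evaluated, then return its outcome."""
    evaluated.append(name)
    return value


def first_decision(a, b):
    # The right-hand operand is only evaluated when the left one is true.
    return condition("a > 0", a > 0) and condition("b > 0", b > 0)


first_decision(-1, 1)
print(evaluated)  # ['a > 0']
```

With a = -1 the condition b > 0 is never evaluated, so a fault in it cannot influence the result of this test case.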
A criterion that solves this problem is multiple-condition coverage. This criterion requires that you write sufficient test cases such that all possible combinations of condition outcomes in each decision, and each statement, are invoked at least once. For the code above, test cases must cover the following combinations:
a > 0, b > 0
a > 0, b <= 0
a <= 0, b > 0
a <= 0, b <= 0
a < 0, b < 0
a < 0, b >= 0
a >= 0, b < 0
a >= 0, b >= 0
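A short sketch shows that four test cases are enough to cover the eight combinations listed above. It is written in Python for brevity; `combinations_hit` and `required` are illustrative names, and each condition is evaluated independently of short-circuiting.

```python
from itertools import product


def combinations_hit(suite):
    """Return the condition-outcome combinations produced by a test suite.

    Each entry is (decision number, outcome of its condition on a,
    outcome of its condition on b).
    """
    hit = set()
    for a, b in suite:
        hit.add((1, a > 0, b > 0))
        hit.add((2, a < 0, b < 0))
    return hit


# Eight required combinations: four per decision.
required = {
    (decision, on_a, on_b)
    for decision in (1, 2)
    for on_a, on_b in product((True, False), repeat=2)
}

suite = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
print(required - combinations_hit(suite))  # set()

# The two-test suite used earlier leaves four combinations uncovered.
print(len(required - combinations_hit([(1, 1), (-1, -1)])))  # 4
```

Each test case contributes one combination to each decision, which is why four cases can cover eight combinations here.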
To summarize:
For programs containing only one condition per decision, a minimum test criterion is a sufficient number of test cases to 1) evoke all outcomes of each decision at least once, and 2) invoke each statement at least once.
For programs containing decisions with multiple conditions, a minimum test criterion is a sufficient number of test cases to 1) evoke all possible combinations of condition outcomes in each decision, and 2) invoke each statement at least once.
We previously said that a good test case is one that is very likely to find an error, but we also said that exhaustive-input testing of a program is impossible. Therefore, you are limited to trying a small subset of all possible inputs. Of course, you want to select the right subset, i.e., the subset with the highest probability of finding an error. One way of locating this subset is to realize that a well-selected test case is not just one that specifies an input and an expected result, but one that also has the following two properties:
It reduces by more than one the number of test cases that must be developed to achieve some predefined goal of 'reasonable' testing.
It covers a large set of other possible test cases.
The first property implies that each test case must invoke as many different input considerations as possible to minimize the total number of test cases necessary. The second implies that you should try to partition the input domain of a program into a finite number of equivalence classes such that you can reasonably assume that a test of a representative value of each class is equivalent to a test of any other value. That is, if one test case in an equivalence class detects an error, all other test cases from the same equivalence class would be expected to find the same error. Conversely, if a test case did not detect an error, no other test case from the same equivalence class would be expected to find an error.
These two considerations form a black-box testing methodology known as equivalence partitioning. Equivalence partitioning can be defined as partitioning (dividing) input conditions into a set of classes where each class is representative of a large set of other possible tests. Test-case design using equivalence partitioning has two distinct steps: identifying Equivalence Classes (EC), and defining test cases:
Identifying Equivalence Classes (EC)
For each input, identifying the ECs is largely a heuristic process. A set of guidelines is shown below:
If the input condition is a range of values (e.g., ID must be between 1 and 99999), identify one valid EC (ID between 1 and 99999) and two invalid ECs (ID < 1, and ID > 99999).
If the input condition specifies a number of values (e.g., there can be 0 to 10 managers), identify one valid EC (ManagerCount between 0 and 10), and two invalid ECs (ManagerCount < 0, and ManagerCount > 10).
If the input condition specifies a set of input values and there is reason to believe that the program handles each one differently (Language must be C#, C++ or C), identify a valid EC for each input and one invalid EC (Language is Java).
If the input condition specifies a 'must be' situation (First character of identifier must be numeric), identify one valid EC (first character is numeric) and one invalid EC (first character is non-numeric).
If there is reason to believe that the program does not handle elements in an EC identically, split the EC into smaller ECs.
Defining the test cases
The process is as follows:
Assign a unique identifier to each EC.
Write test cases to cover all the valid ECs. A single test case can cover more than one valid EC.
Write test cases to cover all the invalid ECs. Each test case should cover one and only one invalid EC, because a check for one invalid input can mask the checks for the others.
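The two steps can be sketched for the ID, ManagerCount, and Language inputs used in the guidelines above. The sketch is in Python for brevity; the record layout, the class identifiers (V1 to V5, I1 to I5), and the sample values are assumptions made for illustration.

```python
# Step 1: identify the equivalence classes and give each a unique identifier.
VALID_CLASSES = {
    "V1": lambda r: 1 <= r["id"] <= 99999,
    "V2": lambda r: 0 <= r["managers"] <= 10,
    "V3": lambda r: r["language"] == "C#",
    "V4": lambda r: r["language"] == "C++",
    "V5": lambda r: r["language"] == "C",
}
INVALID_CLASSES = {
    "I1": lambda r: r["id"] < 1,
    "I2": lambda r: r["id"] > 99999,
    "I3": lambda r: r["managers"] < 0,
    "I4": lambda r: r["managers"] > 10,
    "I5": lambda r: r["language"] not in ("C#", "C++", "C"),
}


def classes_covered(record, classes):
    """Return the identifiers of the classes that the record belongs to."""
    return sorted(name for name, member in classes.items() if member(record))


# Step 2: valid test cases may each cover several valid classes.
base = {"id": 500, "managers": 3, "language": "C#"}
valid_cases = [base, {**base, "language": "C++"}, {**base, "language": "C"}]

# Each invalid test case changes exactly one input of a valid record.
invalid_cases = [
    {**base, "id": 0},
    {**base, "id": 100000},
    {**base, "managers": -1},
    {**base, "managers": 11},
    {**base, "language": "Java"},
]

print(classes_covered(base, VALID_CLASSES))  # ['V1', 'V2', 'V3']
print([classes_covered(c, INVALID_CLASSES) for c in invalid_cases])
# [['I1'], ['I2'], ['I3'], ['I4'], ['I5']]
```

Three valid test cases cover all five valid classes, while the five invalid classes need five separate test cases.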
Boundary value analysis is another black-box testing methodology in which special attention is paid to values at or near a boundary. Boundary value analysis is used extensively when testing numerical ranges, mathematical calculations, buffer sizes, and any other feature affected by some type of a limit.
The typical tests defined at a boundary consist of these three conditions:
Boundary value.
Boundary value - 1.
Boundary value + 1.
This means that applying boundary value analysis to a bounded range (which includes both lower and upper boundaries) results in the following test conditions:
Minimum.
Minimum - 1.
Minimum + 1.
Maximum.
Maximum - 1.
Maximum + 1.
Middle (or typical) value.
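These seven test conditions can be generated for any bounded integer range. The sketch below is in Python for brevity; `boundary_values` and `in_range` are hypothetical helpers, and the range 1 to 99999 is borrowed from the ID example used earlier.

```python
def boundary_values(minimum, maximum):
    """Return the seven boundary-value test inputs for an integer range."""
    return {
        "minimum - 1": minimum - 1,
        "minimum": minimum,
        "minimum + 1": minimum + 1,
        "middle": (minimum + maximum) // 2,
        "maximum - 1": maximum - 1,
        "maximum": maximum,
        "maximum + 1": maximum + 1,
    }


def in_range(value, minimum=1, maximum=99999):
    """Hypothetical check under test: is the ID within the allowed range?"""
    return minimum <= value <= maximum


values = boundary_values(1, 99999)
print(values["middle"])  # 50000
print([in_range(v) for v in values.values()])
# [False, True, True, True, True, True, False]
```

Only the two values just outside the range are expected to be rejected; an off-by-one error in either comparison would change one of the five accepted results or one of the two rejected ones.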
Consider a user-login screen in which a user enters a login name in one field and a password in another field. The following table shows how boundary value analysis applies to this dialog box:
Boundary Value Condition | Initial state of Username or Password Field | Expected result after typing an additional character |
Minimum | 0 characters (empty field). | 1 Character in the field. |
Minimum - 1 | Not possible. | Not possible. |
Minimum + 1 | 1 character in the field. | 2 characters in the field. |
Maximum | maximum number of characters. | Application ignores extra characters. It is not possible to type any more characters. |
Maximum - 1 | maximum -1 number of characters. | maximum number of characters. |
Maximum + 1 | Not possible: the application does not allow the entry of more characters than the maximum. | Not possible. |
Typical value | minimum + 2 through maximum - 2 characters in the field. | Application accepts new characters. |
Boundary conditions are those situations directly on, above, and beneath the edges of input ECs and output ECs. Boundary value analysis is actually a refinement of equivalence partitioning with two major differences: 1) elements are selected such that each edge of the EC is the subject of a test, and 2) rather than focusing exclusively on input conditions, output conditions are also explored by defining output ECs.
The second difference above, examining the boundaries of the result space, is quite important because it is not always the case that the boundaries of the input space represent the same set of circumstances as the boundaries of the output ranges.
The guidelines for boundary value analysis are:
1. If an input specifies a range of valid values, write valid-input test cases for both ends of the range and invalid-input test cases for conditions just beyond the ends. For example, if the input requires a real number between 0.0 and 10.0, write test cases for 0.0, 10.0, -0.1, and 10.1.
2. If an input specifies a number of valid values, write valid-input test cases for the minimum and maximum number of values and invalid-input test cases for one below the minimum and one above the maximum. For example, if a timesheet input requires the entry of at least 3 days but no more than 7 days, write test cases for 3 and 7 days and for 2 and 8 days.
3. Use guideline 1 for each output condition. For example, if a program computes the number of pending trades in a queue, and if the minimum is 0 and the maximum is 10,000 trades, write a test case that causes the output to be 0 trades and another that causes the output to be 10,000 trades. Also see if it is possible to write test cases that might cause a negative number of trades or a number of trades greater than 10,000.
4. Use guideline 2 for each output condition. If a trade-finding screen can display information for 1, 2, 3, or 4 trades but not more, write test cases such that the program displays 1 and 4 trades, and write a test case that might cause the program to display 5 trades.
5. If the input or output of a program is an ordered set (e.g., a sequential file, an array, or even a table), focus attention on the first and last elements of the set.
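The first two guidelines can be sketched as table-driven checks, using the real-number range and the timesheet example above. The sketch is in Python for brevity, and both validators (`accept_amount`, `accept_timesheet`) are hypothetical stand-ins for the program under test.

```python
def accept_amount(value):
    """Hypothetical validator: accept a real number from 0.0 to 10.0."""
    return 0.0 <= value <= 10.0


def accept_timesheet(days):
    """Hypothetical validator: accept between 3 and 7 day entries."""
    return 3 <= len(days) <= 7


all_days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun", "Mon"]

# Range of values: valid at both ends, invalid just beyond them.
range_results = {v: accept_amount(v) for v in (-0.1, 0.0, 10.0, 10.1)}
print(range_results)  # {-0.1: False, 0.0: True, 10.0: True, 10.1: False}

# Number of values: valid at 3 and 7 entries, invalid at 2 and 8.
count_results = {n: accept_timesheet(all_days[:n]) for n in (2, 3, 7, 8)}
print(count_results)  # {2: False, 3: True, 7: True, 8: False}
```

The same pattern applies to output conditions, except that the inputs must then be chosen so that the output lands on its boundary.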
Boundary value analysis is not as simple as it sounds, because boundary conditions may be difficult to identify. It also becomes more complex for combinations of inputs.
One weakness of boundary value analysis and equivalence partitioning is that they do not explore combinations of input conditions. Testing input combinations is not a simple task, because even if you equivalence-partition the input conditions, the number of combinations is usually very high. If you do not have a systematic way of selecting a subset of input conditions, you will probably select an arbitrary subset, which can lead to an ineffective set of test cases.
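The growth in the number of combinations is simple arithmetic. As a rough sketch in Python, assume four inputs whose class counts follow the equivalence-class guidelines given earlier (three classes for a range or a count, four for the three-language set, two for a 'must be' condition); the input names are illustrative.

```python
from math import prod

# Assumed number of equivalence classes (valid plus invalid) per input.
classes_per_input = {"id": 3, "managers": 3, "language": 4, "first_char": 2}

# Testing every combination means one test per element of the product.
total = prod(classes_per_input.values())
print(total)  # 72

# Each extra input multiplies the total; ten inputs with three classes each:
print(3 ** 10)  # 59049
```

Even with only a handful of classes per input, the product quickly exceeds what can be tested by hand, which is the motivation for a systematic selection method.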
Cause-effect graphing is a systematic approach to selecting a high-yield set of test cases that explore combinations of input conditions. It is a rigorous method for transforming a natural-language specification into a formal-language specification, and exposes incompleteness and ambiguities in the specification.
Cause-effect graphing is useful when working from a functional (external) specification because it exposes errors in the specification. However, it is difficult and time-consuming to apply. Refer to The Art of Software Testing by G. Myers for a full example.
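To give a flavor of the idea, the sketch below expresses a small hypothetical specification as Boolean causes and effects and enumerates the feasible combinations as a decision table. The specification (a two-character command, the update effect, and the message codes X12 and X13) is assumed for illustration, and the code, written in Python for brevity, simply enumerates combinations rather than performing the full graph-tracing procedure.

```python
from itertools import product


def effects(c1, c2, c3):
    """Map causes to effects for the hypothetical specification.

    c1: the first character is 'A'
    c2: the first character is 'B'
    c3: the second character is a digit
    """
    return {
        "update file": (c1 or c2) and c3,
        "message X12": not (c1 or c2),
        "message X13": not c3,
    }


# Constraint: c1 and c2 cannot both hold, since the first character
# cannot be 'A' and 'B' at the same time.
table = [
    ((c1, c2, c3), effects(c1, c2, c3))
    for c1, c2, c3 in product((True, False), repeat=3)
    if not (c1 and c2)
]

print(len(table))  # 6
print(effects(False, False, False))
# {'update file': False, 'message X12': True, 'message X13': True}
```

Each row of the table is a candidate test case. Writing the effects as explicit Boolean functions is also where gaps in a specification tend to surface, for example whether both messages should appear when both characters are wrong.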
Error-guessing is an ad-hoc approach based on experience and intuition to identify tests that are considered likely to expose errors. The basic idea is to come up with a list of possible errors and error-prone situations and then write test-cases based on the list.
So how do you come up with such a list? Bug lists and defect reports are quite useful in identifying potential errors. However, since an exact procedure cannot be given, the best thing is to present a simple example. If you are testing a sorting routine, you may consider exploring the following situations:
The input list is empty.
The input list contains one entry only.
The input list is already sorted.
All entries in the list have the same value.
In other words, you list those cases that may have been missed when the program was designed.
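The four situations translate directly into test data. In this Python sketch, `sort_under_test` is a stand-in for the routine being tested (here it simply wraps the built-in `sorted`), and `is_correct_sort` is an assumed oracle that checks ordering and that no element was lost or added.

```python
from collections import Counter


def sort_under_test(items):
    """Stand-in for the routine being tested; here it wraps sorted()."""
    return sorted(items)


def is_correct_sort(result, original):
    """Check ascending order and that the same elements are present."""
    ordered = all(x <= y for x, y in zip(result, result[1:]))
    return ordered and Counter(result) == Counter(original)


error_guessing_cases = {
    "empty list": [],
    "single entry": [7],
    "already sorted": [1, 2, 3, 4],
    "all entries equal": [5, 5, 5],
}

outcomes = {
    name: is_correct_sort(sort_under_test(data), data)
    for name, data in error_guessing_cases.items()
}
print(outcomes)
```

Replacing `sort_under_test` with the real routine turns the list of guessed situations into an executable checklist.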
All the test-case design methodologies discussed can be combined into an overall strategy:
If the specification contains combinations of input conditions, start with cause-effect graphing.
Use boundary-value analysis for input and output boundaries.
Identify valid and invalid equivalence classes for the input and output.
Use error-guessing techniques to add additional tests.
Examine the program's logic with regard to the set of test cases. Use the decision-coverage, condition-coverage, decision/condition-coverage, or multiple-condition-coverage criterion, and add test cases where the chosen criterion is not yet satisfied.