Smalltalk: An almost complete testing environment


Author: Philip Haynes, Object Oriented Pty Ltd (p.haynes@oose.com.au)


A mature view of software testing is to see it as a process of reducing the risk of software failure in the field to an acceptable level [Beizer 90]. Whilst modern industrial Smalltalk development environments provide strong support for many of the major subprocesses required for software production (including detailed design, code development, configuration management, the build environment and internal documentation), they currently lack support for the verification phase of software development. Although less glamorous than many of the other phases of software development, software testing is critical for the development and deployment of industrial software systems. Unless the Smalltalk community can properly address this issue, the language and its environment are destined to remain an interesting research and prototyping tool. After all, would you risk your life, bank account or business on a poorly tested / unverifiable software system?

A great strength of Smalltalk is its integrated development environment. It is this author's position that this environment should be built upon to support software testing. Making the required extensions would not involve major changes to the current development environment. This paper details the major tasks required to effectively test software and very briefly highlights how the Smalltalk development environment can be modified to support these tasks.

Effective testing of software requires:

  1. Definition of test cases (both model and GUI code);
  2. Determination of the effectiveness and completeness of those tests;
  3. Use of the test cases to determine and measure the quality of the software;
  4. Identification and correction of any defects in the software and / or the test cases; and
  5. Shipment of the software once the number of defects falls to an acceptably low level.

From this it can be seen that to support software testing adequately, both technical and management issues must be addressed. At both levels, the collection of quantitative data is required, since only with measurement does it become possible to make any sort of reasonable risk assessment. Looking at the steps in the testing process, step 3 is the logical extension of step 2. Step 4 is already well supported by Smalltalk environments (with their integrated debugging and coding facilities). Determining when to ship (step 5) involves management risk and / or cost-benefit assessment. The key areas in which Smalltalk is lacking are steps 1 and 2, i.e. test case definition and the determination of test effectiveness. These are discussed below.

Test Case Definition - Model Code

At the test case definition stage of software testing, Smalltalk provides a strong foundation for defining test suites for model code. A unit test suite should be defined using a testing framework, with the test harness stored in a sibling (Envy or equivalent) application. That is, for each group of classes stored in an application, there should exist another application which tests it. Each time a test is run, its execution should be recorded and the test results listed with other configuration management information. To support such facilities, the Smalltalk community should create an industry standard testing framework which automatically records testing artefacts such as test runs, their results and the configuration against which they were executed.
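
As a rough sketch of what such a framework might offer, consider a hypothetical test harness for an Account model class. The class, selectors and recording protocol shown here (AccountTest, assert:description:, run) are invented for illustration only; an industry standard framework would standardise this protocol and tie each recorded run to the Envy edition of the application under test.

    Object subclass: #AccountTest
        instanceVariableNames: 'results'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Account-Testing'

    "AccountTest>>assert:description: records a pass or failure rather than
     halting, so that every check can be logged with the run."
    assert: aBoolean description: aString
        results add: aString -> aBoolean

    "AccountTest>>testDeposit exercises one piece of model behaviour."
    testDeposit
        | account |
        account := Account new.
        account deposit: 100.
        self assert: account balance = 100 description: 'deposit updates the balance'

    "AccountTest>>run executes every method whose selector begins with 'test'
     and answers the collected results for recording."
    run
        results := OrderedCollection new.
        (self class selectors asSortedCollection
            select: [:each | 'test*' match: each])
                do: [:each | self perform: each].
        ^results

A sibling testing application would hold such classes, and the answer from each invocation of run could be stored in the library alongside the edition of the application under test.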

Test Case Definition - GUI Code

A frequent problem with GUI testing is the volatility of the screen layout, which makes the creation of regression test suites both expensive and time consuming. One way in which the cost of this type of testing could be reduced would be to take advantage of the integrated environment and build test cases up from basic widget events. Whilst this does not provide a complete GUI testing solution, the widgets themselves and the events which they generate change much less often than the visual appearance of the screen. This means that this type of testing is at least feasible, and thus more likely to get done, rather than being so expensive that only partial manual testing is ever performed.
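
To make this concrete, a recorder along the following lines could capture widget events as (widget name, event selector, argument) triples while a tester drives the application, and replay them later against whatever the current layout happens to be. The names used here (WidgetEventRecorder, widgetNamed:) are assumptions for the sketch and not part of any existing Smalltalk GUI framework.

    Object subclass: #WidgetEventRecorder
        instanceVariableNames: 'script'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'GUI-Testing'

    "WidgetEventRecorder>>record:on:with: is invoked from the widgets' event
     hooks while the tester drives the application by hand."
    record: eventSelector on: widgetName with: anArgument
        script isNil ifTrue: [script := OrderedCollection new].
        script add: (Array with: widgetName with: eventSelector with: anArgument)

    "WidgetEventRecorder>>playBackOn: replays the captured events against the
     named widgets, independently of where they appear on the screen."
    playBackOn: anInterface
        script do: [:event |
            (anInterface widgetNamed: (event at: 1))
                perform: (event at: 2) with: (event at: 3)]

Because the script refers to widgets by name and event selector rather than by screen position, it survives most changes to the layout of the window.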

Whilst such a tool does not yet exist, the creation of a tool to capture widgets and the events they generate would not represent a major effort (approximately 12-18 man-months for a commercial product).

Effectiveness and Completeness of Testing

With test cases built into the environment, the completeness and effectiveness of tests can be assessed by looking at the coverage of the tests (as measured by code coverage tools), the rate of defect arrival and how often the test cases are run. These figures can then be correlated against other measures, such as the complexity of the classes and the rate of code change, to provide an understanding of the effectiveness / completeness of the testing.
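
As a simple illustration of the kind of figure this yields, suppose a hypothetical CoverageRecorder notes each selector of a class that is executed while its test suite runs; method coverage is then a straightforward comparison with the class's full selector set (executedSelectorsFor: is an invented selector for this sketch).

    | executed coverage |
    "Selectors of Account actually executed while its test suite ran."
    executed := CoverageRecorder executedSelectorsFor: Account.
    "Coverage as a percentage of all methods defined in the class."
    coverage := executed size / Account selectors size * 100.
    Transcript
        show: 'Account method coverage: ';
        show: coverage asFloat printString;
        show: '%';
        cr.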

Summary

The above discussion has highlighted four main areas in which the Smalltalk development environment should be extended to better support testing. These areas are:

  1. An industry standard testing framework for defining model code test suites;
  2. Automatic recording of test runs and their results alongside configuration management information;
  3. A tool for capturing widget events from which GUI test cases can be built; and
  4. Code coverage and related measurement tools for assessing the effectiveness and completeness of testing.

However, it cannot be forgotten that good testing is not just a technology problem, but also a technical management one. Consequently, such extensions should be supported by the development and publication of appropriate testing processes.