06. Organizing code to enable unit checking

Dave Nicolette edited this page Jun 20, 2015 · 1 revision

If we want to run automated unit tests against a Cobol program, we need a way to isolate the individual paragraphs containing logic that is worthwhile to test at the "unit" level. Of course, we will also test the overall functionality of the program as a whole, in its normal runtime context. One level of automated testing does not obviate the need for other levels of automated testing.

Look for the sample program, CONVERT.CBL. The purpose of the program is to convert an input file into a different format and write the converted data to an output file. It is a simplified example of a very common type of batch program in mainframe environments.

Three versions of the program are provided. CONVERT-BAD.CBL is organized in the same way as many legacy Cobol programs: all the logic of the program is located in a single paragraph. This sort of design is sometimes called spaghetti code. Testing this paragraph as a "unit" would be no different from testing the whole program, as there is no way to execute any subset of the logic in isolation. We can only test the whole program, which would be a "functional" or "integration" test, not a "unit" test.

The second version is called CONVERT-BAD2.CBL. This version is the same as CONVERT-BAD.CBL except that the logically distinct segments of functionality are visually separated by comments. This version can't be unit tested paragraph by paragraph either, but the comments help us see where we could tease the code apart into separate paragraphs. Many legacy programs contain comments like this to help programmers understand the functionality of the code.
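As a rough illustration of what such a program can look like, here is a minimal sketch. It is not the actual CONVERT-BAD2.CBL source. The program name MONOLITH, the file names INFILE and OUTFILE, the two-field comma-delimited record layout, and all the data names are assumptions made up for this example. It also assumes a compiler that supports LINE SEQUENTIAL files, such as GnuCOBOL, and a file named INFILE in the working directory. Each comment marks a point where the code could be split into its own paragraph.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MONOLITH.
      * Hypothetical sketch, not the real CONVERT-BAD2.CBL.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INPUT-FILE ASSIGN TO 'INFILE'
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT OUTPUT-FILE ASSIGN TO 'OUTFILE'
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  INPUT-FILE.
       01  INPUT-RECORD            PIC X(80).
       FD  OUTPUT-FILE.
       01  OUTPUT-RECORD           PIC X(80).
       WORKING-STORAGE SECTION.
       01  END-OF-FILE-FLAG        PIC X VALUE 'N'.
           88  END-OF-FILE         VALUE 'Y'.
       01  NAME-IN                 PIC X(20).
       01  AMOUNT-IN               PIC X(10).
       PROCEDURE DIVISION.
       0000-MAIN.
      * --- open files ---
           OPEN INPUT INPUT-FILE
           OPEN OUTPUT OUTPUT-FILE
      * --- read loop ---
           PERFORM UNTIL END-OF-FILE
               READ INPUT-FILE
                   AT END
                       SET END-OF-FILE TO TRUE
                   NOT AT END
      * --- split the comma-delimited record ---
                       MOVE SPACES TO NAME-IN AMOUNT-IN
                       UNSTRING INPUT-RECORD DELIMITED BY ','
                           INTO NAME-IN AMOUNT-IN
                       END-UNSTRING
      * --- convert the name to upper case ---
                       MOVE FUNCTION UPPER-CASE(NAME-IN) TO NAME-IN
      * --- build and write the fixed-format record ---
                       MOVE SPACES TO OUTPUT-RECORD
                       STRING NAME-IN AMOUNT-IN DELIMITED BY SIZE
                           INTO OUTPUT-RECORD
                       END-STRING
                       WRITE OUTPUT-RECORD
               END-READ
           END-PERFORM
      * --- close files ---
           CLOSE INPUT-FILE OUTPUT-FILE
           STOP RUN.
```

Because everything lives in 0000-MAIN, the only way to exercise the upper-case conversion is to run the whole program against a real input file.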

The logical points in source code where we can separate chunks of code without breaking the functionality are called seams. The analogy is with the seams in your clothing, which are the easiest places to pull the clothing apart. (Note: It's only an analogy. In most cases, it is more desirable to pull your monolithic code apart than it is to pull your clothing apart.)

The third version is called CONVERT.CBL. This version illustrates the way to organize source code so that the individual paragraphs can be unit tested in isolation, without having to set up a full run of the program with real datasets. Input/output processing and initialization logic are separated from "business logic." Small, discrete pieces of functionality are coded in separate paragraphs. Those paragraphs have no runtime dependencies on anything external to the program. This allows a unit test case to perform a single paragraph without requiring all the program's runtime dependencies to be in place.
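The following sketch shows one way this kind of separation might look. It is not the actual CONVERT.CBL source. The program name SEPARATED, the paragraph names, the file names INFILE and OUTFILE, and the two-field comma-delimited layout are all assumptions invented for illustration, and LINE SEQUENTIAL assumes a compiler such as GnuCOBOL. The point to notice is that the 3000-series paragraphs touch only WORKING-STORAGE fields, while all file access is confined to other paragraphs.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SEPARATED.
      * Hypothetical sketch, not the real CONVERT.CBL.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INPUT-FILE ASSIGN TO 'INFILE'
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT OUTPUT-FILE ASSIGN TO 'OUTFILE'
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  INPUT-FILE.
       01  INPUT-RECORD            PIC X(80).
       FD  OUTPUT-FILE.
       01  OUTPUT-RECORD           PIC X(80).
       WORKING-STORAGE SECTION.
       01  END-OF-FILE-FLAG        PIC X VALUE 'N'.
           88  END-OF-FILE         VALUE 'Y'.
      * Business logic reads and writes only these fields.
       01  RAW-LINE                PIC X(80).
       01  NAME-FIELD              PIC X(20).
       01  AMOUNT-FIELD            PIC X(10).
       01  FORMATTED-LINE          PIC X(80).
       PROCEDURE DIVISION.
       0000-MAIN.
           PERFORM 1000-OPEN-FILES
           PERFORM 2000-READ-INPUT
           PERFORM UNTIL END-OF-FILE
               PERFORM 3000-SPLIT-FIELDS
               PERFORM 3100-UPPERCASE-NAME
               PERFORM 3200-BUILD-OUTPUT
               PERFORM 4000-WRITE-OUTPUT
               PERFORM 2000-READ-INPUT
           END-PERFORM
           PERFORM 9000-CLOSE-FILES
           STOP RUN.

      * --- input/output paragraphs ---
       1000-OPEN-FILES.
           OPEN INPUT INPUT-FILE
           OPEN OUTPUT OUTPUT-FILE.

       2000-READ-INPUT.
           READ INPUT-FILE INTO RAW-LINE
               AT END SET END-OF-FILE TO TRUE
           END-READ.

       4000-WRITE-OUTPUT.
           WRITE OUTPUT-RECORD FROM FORMATTED-LINE.

       9000-CLOSE-FILES.
           CLOSE INPUT-FILE OUTPUT-FILE.

      * --- business logic paragraphs: no file access ---
       3000-SPLIT-FIELDS.
           MOVE SPACES TO NAME-FIELD AMOUNT-FIELD
           UNSTRING RAW-LINE DELIMITED BY ','
               INTO NAME-FIELD AMOUNT-FIELD
           END-UNSTRING.

       3100-UPPERCASE-NAME.
           MOVE FUNCTION UPPER-CASE(NAME-FIELD) TO NAME-FIELD.

       3200-BUILD-OUTPUT.
           MOVE SPACES TO FORMATTED-LINE
           STRING NAME-FIELD AMOUNT-FIELD DELIMITED BY SIZE
               INTO FORMATTED-LINE
           END-STRING.
```

With this layout, a test case could move a value into NAME-FIELD, perform 3100-UPPERCASE-NAME on its own, and compare NAME-FIELD with the expected result, all without opening either file.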

Some people who are new to the idea of unit testing worry that a multitude of tiny paragraphs will cause performance problems compared with inline code. There is no need to worry: the optimizers in mature Cobol compilers will typically inline performed paragraphs where appropriate. For purposes of human readability and code testability, it's more important to keep the distinct chunks of logic nicely separated than it is to try to hand-optimize the source code for runtime performance. (Note: Yes, you are a highly skilled programmer. No, you will not be able to hand-optimize your code more effectively than the optimizer. Don't waste your time trying.)

Before you can start writing unit tests against legacy Cobol programs, you may have to reorganize the code in a way similar to changing CONVERT-BAD.CBL into CONVERT.CBL. This sort of modification, which changes the internal structure of code without changing its behavior, is called refactoring. The analogy is with algebra, in which we can factor expressions in ways that change their form without changing their meaning.