Objects First |
| Colour | Code |
|---|---|
| black | 0 |
| blue | 1 |
| red | 2 |
| yellow | 3 |
| ... | .. |
int colour;
colour = Colour( r );
if ( colour == 0 ) {
/* Process a black rectangle */
...
}
else if ( colour == 1 ) {
/* Process a blue rectangle */
...
}
All very straightforward,
until we receive the latest direction from our colour consultant
"at least 50% of rectangles should be puce in colour"
and we have to change the program
(after we've found out what is actually meant by puce!).
Or we discover that tangerine rectangles are being
processed with the rules for orange ones.
Then we find that
A much better solution is to have a class of colours,
each represented by a symbolic name,
eg black, blue, red, ..etc,
and to be able to use those names in our program:
Colour colour;
colour = Colour( r );
if ( colour == black ) {
/* Process a black rectangle */
...
}
else if ( colour == blue ) {
/* Process a blue rectangle */
...
}
Not only is this much easier to read,
it presents far fewer maintenance headaches.
Adding a new colour involves simply deciding on
a new, meaningful name,
eg puce can be called
puce
(how original!),
and then adding some statements to our program.
No existing program statements should need to change,
thus we can be quite confident that,
after we have added the statements to handle the new colour,
processing of all the original colours will still be correct.
typedef enum { black, blue, red, yellow } Colour;
| What's happening here? |
|---|
|
In order to produce a program that runs efficiently,
the C compiler will assign codes to the various symbols
that we have placed in the list of possible values of
objects of this class.
In the Colour example,
the compiler starts at zero and assigns codes 0, 1, 2, ... to the
symbols in the order in which it encounters them.
This of course results in exactly the same table
as we had originally.
So what have we gained? Easier maintenance and readability!
|
They also have a limited set of sensible operations: basically, ==, != and assignment.
Unfortunately, being a language designed for
hackers, C permits all the operations defined for integers
on enum types.
Thus:
#include <stdio.h>
int main(void) {
    typedef enum { black, blue, red, yellow, green, purple } Colour;
    Colour a, b, c;
    a = black; b = blue; c = red;
    /* Ordering comparisons are accepted because enums behave as integers */
    if ( a > c ) printf("a>c\n"); else printf("a<=c\n");
    if ( b >= c ) printf("b>=c\n"); else printf("b<c\n");
    printf("a = %d, b = %d, c = %d\n", a, b, c );
    return 0;
}
is perfectly legal C and executes without any problems.
My Unix system prints:
a<=c
b<c
a = 0, b = 1, c = 2
for this example.
However, ==, != and assignment are required by the standard to work in equivalent ways on any ANSI C implementation. Thus it is best to regard an enumerated type as possessing only these three operations or methods. (Some more strongly typed languages, such as Pascal, Modula and Ada, enforce such a restriction. C is often referred to as weakly typed because of the relaxations in operation usage rules that it permits.)
If you are using an enumerated type in a single program, then the separation of the logical operation of the program from issues of representation (what code is used for "blue"?, etc) presents no problems. Although there is one caveat:
If you alter the definition of an enumerated type, eg to add an extra value, then you must re-compile all the program modules that use that type. In a well-constructed C program, the definition of the class (the typedef enum { ... } ClassName; statement) will be in a header file (.h extension) which is included in all modules which use objects of the class. If a change is made to the definition, then all the modules which import the definition must be re-compiled.
Even though an identical definition of an enumerated type may be imported into two different programs, you should not rely on the stored representations in the two programs being the same. ANSI C does fix the default numbering (the first symbol is 0 and each later symbol is one more than its predecessor), but the size of the integer used to store an object of the type is chosen by the compiler, and the numbering silently shifts as soon as one of the programs is built from a version of the definition in which a symbol has been added or re-ordered. Different machines, or even different compilers on the same machine, may also store the value differently. Thus a problem arises if program A writes objects of an enumerated type into a file (or onto a communications channel) and they are read by program B (on the same or a different machine).
To overcome this problem and to permit wider use of enumerated
types (with the consequent gain in legibility and maintainability
of programs),
C allows us to specify the actual values to be used when
defining the class:
typedef enum { black=1, blue=2, red=3, .. } Colour;
Now objects of the class Colour can be written into files
and read by other programs with no problems.
We have lost the convenience of allowing the compiler to
assign codes to the values our objects can take.
Thus we may have to make some major re-adjustment if we want
to ensure that when we add puce to the class,
the value assigned to it reflects that it lies somewhere in the
pale yellow, brown, pink, .. region.
Of course, if this is not a consideration, then life is simple,
we just add puce = x+1 to the end of the
list, where x is the value allocated to the last item
currently in the list.
However, even a major re-numbering exercise
(occasioned by wanting to locate
puce next to
sickly_yellow) only needs to be done in
one statement.
All other code using the symbolic values is then simply re-compiled.
The only code adjustment is the adding of the special cases for
puce to the program.
Thus it is possible to retain the major advantage of enumerated types - legible, understandable code - with only a small increase in the maintenance effort, while still allowing objects of enumerated types to be passed between programs via files and communications channels.
Enumerated types are particularly useful in multiple-choice branching statements or switch statements.
/* Colour.h - Class of colour definitions */
#define BLACK 0
#define BLUE 1
#define RED 2
#define YELLOW 3
#include "Colour.h"
int colour;
if ( colour == BLACK ) {
...
}
else if ( colour == BLUE ) {
...
}
else if ( colour == RED ) {
...
}
..
Although I have a slight preference for the enumerated type,
there are times when the
#define
approach makes sense also.
These are when the codes which you must use are part of the problem
specification and thus have some meaning in themselves.
For example,
suppose that each colour had to be represented by its number
in a fashion designer's catalogue -
because that number effectively defined the colour.
Then, although I want to program symbolically (and forget the
codes), there are points in the program where the actual
value of the code has real meaning,
for example, when the program tells you to look at the
actual colour - giving its code number now,
because that's what is printed beside the reference
example on page x of the catalogue.
The decision to use an enumerated type rather than a defined
constant is easier in
strongly typed languages such as Pascal
or Ada, because the compiler prevents you from
mixing types (it will reject an attempt to compare an
object of an enumerated type to an integer, for example).
Thus the compiler will give you some help to avoid silly mistakes.
For example, in Ada,
TYPE colour IS (black, blue, red, yellow);
....
c: colour;
...
IF c = 23 THEN ...
END IF;
will generate a compiler error,
because c = 23 is an invalid expression:
you can't compare objects of different types.
Unfortunately, C promotes enums to int in expressions and so cannot give this assistance!