CS 1.5 The Course In Between - Or Why Less May Be More

Michael Kuttner
Department of Computer Science
Langara College

ABSTRACT

At Langara College we have a first course in program design (CPSC1150) that focuses on procedural abstraction using C++ but without classes. We also have a clone of UBC’s Scheme based CPSC128 and in second year a traditional Data Structures course (CPSC2150)

However, between 1150 (with C++) and 2150 (C++ and Java) we have a second course in program design (CPSC1160.) This course concentrates on Data Abstraction and Data Representation, a thorough exposure to basic object technology using C++. The course teaches classes and linear data structures but not inheritance or non-linear data structures.

When first taught, it seemed like a typical course for the second half of first year, but its effect on its students made us think again. We notice that students who complete this course successfully frequently claim to have their programming abilities transformed, though up to 50% withdraw before completion.

This paper describes how students achieve a level of software programming competence through lectures, three substantial assignments and associated team labs designed to support them by contributing directly to their final programs. Many of the labs, quizzes and exams involve creating ADT interfaces without much in the way of C++ class code.

In spite of having fewer topics than a traditional sceond programming course, the students do not complain that they are short of something to do. The paper describes why, with sample details of the three larger assignments. Assignments are completed individually, but the weekly labs are tackled in groups of three. These produce code that is incorporated into assignments and I mark them myself to enforce good design and style standards.

The course emphasizes design a great deal (though students learn enough C++ too) such that they tend not to get stuck on later software design in their other ACST courses. I believe the approach is worth noting simply because of student results and student feedback.

Keywords: programming software design object interface adt anecdotal

1. Introduction

I have spent the last several years teaching Langara's CPSC1160, a second course in program design, using C++. The course is a fairly close match to CS102I (I for imperative) among the ACM's multiple curriculum threads [1], but without trying to treat inheritance at all, rather substituting various useful applications and concentrating on designing ADT interfaces. It is quite a demanding course, but I think that the only reason for presenting a paper about it is because of the long-term beneficial effects on students who have survived to go on in our and other institutions' programs.

1.1 Langara's ACST Diploma

This "two-year" diploma in Applied Computer Science and Technology is a tough program combining both software and hardware courses, most of which are built to be university transferable. The main sequence of introductory programming courses are CPSC1150 (C++), CPSC1160 (C++) and CPSC2150 (Java and C++), Program Design I, II and Data Structures. 1150 has slightly fewer topics than CPSC122 at UBC or CMPT101 at SFU, the local universities to which these courses transfer. CPSC2150 transfers to the current UBC CPSC216, its Data Structures opposite number.

CPSC1160, the subject of this paper is "the course in the middle" usually worth 3 unassigned credits. I call it that because there is no equivalent course at either UBC or SFU, whither most of our students transfer. This means that there is time to cover some topics in a more leisurely and thorough manner. This course is the first required course for students starting ACST. All of our courses are 6 hours a week (4 lecture + 2 lab.)

In addition, ACST requires students to have taken the CPSC survey course 1050, because it contains a Javascript component. They must also take CPSC1181 concentrating on Java with UML, 3 more unassigned credits, and CPSC 1180, with Scheme, for people who plan to transfer to UBC this year.

1.2 How this Version of the Course Arose

The development of CPSC1160 followed some historical accidents. Langara used to have two procedural programming courses, such as one in Pascal, one in Modula; or 1.75 in Modula plus some C/C++ and so on. OOP was introduced in the Data Structures course to some howls from students as their paradigm shifted.

We thought that, without taking the complete plunge into "Early Objects", we should start to introduce basic object concepts and of course classes, since we switched to C++ as first programming language in 1995.

In 1997, I inherited CPSC1160, which employed the very useful Headington & Riley book [4] as its text, at the very moment that D.C.Heath was bought out and the book vanished below the horizon.

We adopted Main & Savitch [5] as a "lesser evil" but did not find it usable to discuss design. Thus, I constructed the course out of the bunch of topics that I thought were important in the course. The following term we dropped M & S, as H & R rose like a phoenix from Heath's ashes. However, by then, I was quite fond of the topics in the course, but I thought it was a course just like my predecessors had taught – another second term programming course.

Over the years though, through student feedback, I discovered that it was different, and I figured that the differences lay in its concentration on writing interfaces for the ADTs that were being developed in the assignments, and the labs which were designed to support them.

If there is anything unique here, I hope it is because I had the freedom to develop a course that utilises the following:

1.3 Guiding Principles

I believe that the sequence of Algorithms and Procedural Abstraction, followed by Data Abstraction and Representation, then Object Oriented Programming, and Data Structures, is still largely appropriate, though it is clear that there are other quite workable models out there.
I believe that there is a maximum rate at which students can absorb the concepts of program design, and that these concepts must be motivated, supported and amplified by programming projects of sufficient complexity; or we will be wasting the students' time. If material is presented faster than it can be absorbed, we may as well not bother to present it.
Concepts without syntax results in very little knowledge at the end of the course. Syntax without concepts results in genuine knowledge of syntax that lasts for a few days or weeks at most after the final exam.
Neither concepts nor syntax are very profitable to Computer Science students without lab or assignment exercises to amplify them. I taught third year AI at UBC, where I asked the students to compare alpha-beta pruning (just learned) with using hash tables. En masse, they told me that they did not know how to hash. "Did you not learn that in Data Structures?" "Yes, but we had no assignment on it!"
I have had a tendency (over more than 20 years) to reduce somewhat the number of topics in courses in order to increase the depth of coverage. It is often noticed that more topics equates to shallowness of understanding among our students.
I strongly subscribe to the philosophy espoused in the first edition preface to Abelson and Sussman's MIT text [2], where "... we believe that the essential material to be addressed by a subject at this level is not the syntax of particular programming-language constructs, nor clever algorithms for computing particular functions efficiently, nor even the mathematical analysis of algorithms and the foundations of computing, but rather the technique used to control the intellectual complexity of large software systems."
I do not believe that most introductory texts pay more than lip service nowadays to the concept of and development of algorithms. I still have a residual impression that they are the central concept of computer science (see for example Berlinski [3]. )
I do not believe that most programming texts spend a lot of time on the design process. I once, while in a state of temporary insanity, read the whole of an introductory text that mentioned "Problem Solving" in its title. I found four paragraphs vaguely related to design in the whole book. I believe that the assumption by many authors and many instructors is that people already know how to design programs, so that it is necessary just to show them how to spell their designs in the langage du jour (something that particular book does quite well, I concede.) I claim that it is precisely how to create an algorithm and how to design a solution to a problem that are the skills that students need but with which they do not have experience.
It is my observation that a brief overview of ADTs, followed by a heavy dose of class syntax, causes all knowledge of the real purpose of ADT interface design to evaporate.

2. Course Content

2.1 Topics

   1.    Review of previous course.
   2.    Guidelines for Algorithms, Design, Analysis, Testing, Documentation and Code.
   3.    Program Organisation: Modularisation, Libraries.
   4.    Abstract Data Types and Objects.
   5.    Representation Tools provided by C++: Structs, Classes, Enumerated Types, Unions.
   6.    Container Classes & their Implementation: Linear Data Structures.
   7.    Algorithm Efficiency.
   8.    Pointers, References and Dynamic Data.
   9.    Building Classes with Dynamic Data.
   10. Simulation.
   11. Recursive Applications.

2.2 Discussion

Review: The first time I taught 1160, I assumed the students knew what was in the previous course. My review lasted only a few days and only 4 students in one of the sections ended up with a C or better.

Quiz #0, on the first day, establishes to everyone's satisfaction that (a) students cannot say what an algorithm is, and (b) they cannot describe one for a simple problem. This result is from students who were forced into the rote process of writing an algorithm for every lab program they wrote. They agree that what they learn is to write the program first and the algorithm later, thus subverting the whole process. Students have used software that generates pseudocode from C++.

I now take up to two weeks amplifying (and casting new light on) previous material, and more students survive.
Guidelines: I give a bunch of course rules both for algorithm design, documentation, and program layout far tougher than anything they have suffered from before. Program code will be self-documenting or they die! Also there is a presentation about the Software Development Life Cycle with homilies about 90% debugging, 80% maintenance and so on.

For algorithms, I try to convince students, in spite of what they see in books, that most people don't write complete, working algorithms (or programs) from scratch. Instead, it is more useful to write a series of progressively more detailed algorithm sketches, writing a prototype program to test the main logic, which usually tells us how we should write a better algorithm.
Modularisation, Libraries: Organisation of larger programs, hiding of data and implementation. After presentation of these raw concepts, I have tried asking exam questions about Coupling and Cohesion to not much avail.
Abstract Data Types and Objects: The central topic of the course, comparable to importance of Procedural Abstraction in CPSC1150. I approach this by considering some larger problem like simulating traffic intersections or the birth of galaxies, and asking if they fancy writing a C++ (substitute any language here) program to do that. The answer is "No!" because the language has no operations that know about traffic intersections. Now we find that a type is data and operations, and that many languages allow us to revise themselves to make a type that will be useful.

Now we spend a lot of time finding out about ADTs, in particular the design of an interface to raise the language off its floor. This is "fantasyland" time, where we start to collect operations that would help us improve the language enough, so that eventually the answer is "Yes, I would not mind writing a main program now."

When we practice on real problems, mutators are the operations that come to mind first, so then we try to build categories not just of mutators but also constructors and the several varieties of access or, as we discover there are no built-in operations that work with our new type. They gradually become convinced not to worry about data structures or implementations while they do this invention of operations.

In this phase, students are made to write descriptions in English (such a painful activity) of interface operations before they get close to writing code. For each one, they must provide a description for Name, Purpose, Inputs + Preconditions, and Outputs + Postconditions.

I notice in labs and exams that students with previous experience from other institutions cannot refrain from writing many special symbols in their interface writeups (like ; // () int & and so on ) This behaviour is quite revealing about their training even while I am busy trying to eliminate it.

Then, eventually, we need to think about C++, our target language.
Representation Tools provided by C++: Structs, Classes, Enumerated Types, Unions. Having learned that we want to hide data as well as implementation code, we discover that C++ makes the ideal world a little hard to achieve by using upgraded structs. The data normally sits right in the interface with the notation "private" to tell you that you can't see it. I do want to maintain the distinction between what we want (data hiding) and what we get in the real world of C++ objects. At least C++ lets you separate implementation if you want to.
Container Classes: Sets, Bags, Stacks, Queues and Linked Lists ADT are discussed. Two Bag programs are built, examined and compared in detail. The second one counts duplicates and stores structs of unions because their second assignment motivates them to want to know how to do that. We discover that returning a Boolean may be a good idea for numeric functions where the value to be returned is not guaranteed to be valid. Trees are reserved for the Data Structures course (except for hanging the last student that still wants to return -1.)
Algorithm Efficiency: "Big Oh." Measuring "speed" without running programs. Sample analyses are from set implementations, the two bag programs, and some of the simple sorts.
Pointers, References and Dynamic Data: How to do real damage in C. Little demos including pointing into literal strings and watching three different compilers do three different things: more education in what not to do while programming. What C++ is really up to with references. Dynamic arrays and how to double their size.
Classes with Dynamic Data: Shallow v. Deep Copies. The why and how of copy constructors, destructors, and overloaded assignment. Stacks, Queues and Linked Lists return for the dynamic treatment. For motivation we write (and use) a class for very much improved arrays that are able to blow up real good! We fix the drawback of single types by introducing Templated functions and classes, We learn to paste a lot. This course is for learning how to build containers for ourselves. The start of the Data Structures course explores the STL in some detail.
Simulation: I usually centre the final project around a simulation. There are presentations about their uses, discrete vs. continuous, random numbers and distributions, and a single species discrete event simulation demo program, mostly courtesy of Hugh Dempster, my highly respected professor and colleague at UBC.
Recursive Applications. The students get their second pass across the top of recursive methods. I didn't really plan to teach recursion here, simply to show inspiring recursive applications, because most students used to be jointly enrolled in CPSC1180 using Scheme. We start with bad and pointless examples, before moving on to the useful, winding up looking at the two famous programs that sort by recursively partitioning or merging instead. The course finishes with the recursive Snowplough scheduling algorithm.

I always have a plan to give a brief presentation on inheritance at the end of the course but time has never permitted this.

2.3 Supporting Assignments and Labs

The group labs, as noted below often produce code that contributes towards the final program required for the much more substantial assignments.

I do not mark assignments, but I do mark everything else, and am quite rude about code with unreadable aspects that is turned in labs and I would like to see fixed before it appears in an assignment program.

Lab #1 - An exercise, involving an array, at a level from the prerequisite course.
Lab #2 - Rewrite Lab#1 to follow course style rules, add new features plus create library files with separate compilation for specification/implementation.
Assignment #1 - Design, create and implement an ADT for a given problem. It startles the students that they are not asked to write a main program to solve some problem, but just produce a driver that proves that each operation from the interface is working properly. This assignment has used many base topics, like Sets, Matrices, Graphs, a Text Editor, Battleships, Chess Positions, Word Squares, and so on.

It is important to note that students are asked to produce two different implementations for a part of the problem, most often using a one- and then a two-dimensional array for a part of their data structure. This brings up the issue of a second interface lower down, such that, if they are smart, only two functions need to be rewritten for the changed representation.
Lab #3 - Make the ADT interface for Assignment #1. Labs #3 & #4 take place while Assignment#1 is current.
Lab #4 - Write and test a complete (specified) class related to but simpler than Assignment#1.
Assignment #2 - Write a scheduler. The scheduler is maintained as a linked list, to learn about pointing but without pointer syntax. (Displaying the array in index order is a major debugging tool when pointing goes wrong.) We have scheduled:
(a) Operating System Processes.
(b) Langara Airport Flights.
(c) Hospital Surgical Operations.
(d) Hotel Guest Registration.

The data to be scheduled is quite elaborate (structs of unions of ...) and a large data file of operands and sub-operands that are English words. Now they must write a complete driver to handle a given data file.
Lab #5 - Make the ADT interface for Assignment #2 and a set of very careful diagrams for inserting into and deleting from the scheduler linked list. Labs #5 & #6 take place while Assignment #2 is current.
Lab #6 - Write the command interpreter for Assignment #2 (given a skeleton).
Lab #7 - Implement a specified Queue ADT as a circular array.
Assignment #3 - "Project" Usually a simulation. This assignment involves the invention and use of several interacting classes. Simulations are nice because there are always opportunities for a Queue class, a Random Number Generator class, a Graphics Module class as well as for the main "actors". Sample simulations I have used are:
(a) Traffic Intersection (with or without an advanced left turn lane.)
(b) Bank queuing (with individual teller queues versus a single lineup.)
(c) A Candy Machine (Two different models to compare. Done with multiple products, stacks, breakdowns, etc.)
(d) Elevators (Compare two simple control algorithms.)
(e) Airport Flight Takeoff (Compare adding one more runway.)
(f) Shoot 'em Up Typing Tutor (only when real time constructs are available.)
Lab #8 - Implement a specified Stack ADT using dynamic allocation. Labs #8 & #9 take place while Assignment#3 is current.
Lab #9 - Write a text graphics module to interface with the Assignment#3 simulation (given a skeleton).
Lab #10 - Some recursive procedures to write (lab quiz).

3. Effects of the Course

Students with a C in CPSC1150 just do not pass CPSC1160, but this happens in other combinations of courses too. There may be fewer topics in this course than comparable ones at the same level, but I rarely hear complaints from students having too little to do.

CPSC1160 is quite tough, demanding a commitment to work hard, but students who don't get into the B range are unlikely to succeed in our ACST program. Few students fail because of Langara's final withdrawal day being so late, but many do withdraw, even some who would do well but feel that the grade they were chasing to get back to some university is unlikely to materialise.

The main reason that a presentation on this way of teaching the course may be worthwhile is because of feedback from the students who are successful. As they move on through our and other institutions' programs, they report little further trouble with program design. I apologise for being 100% anecdotal here.

When working in groups in such courses as Software Engineering, they report that they can tell which of their teammates took this version of the course and who did not. This applies even among students who heard the same lecture material but did not tackle similar assignments and their supporting labs.

Ex-Langara Students have written from other institutions noting that they think they have a more thorough background in programming and software design than those institutions' own students.

My personal belief is that students need the time to tackle problems complex enough so that the use of classes with elaborate (dynamic) data is properly motivated. I do not think that this can happen over a week or two and one lab or assignment. I also suspect that without this, students will have weaknesses in program design that will give them persistent problems later.

4. Conclusions

The lectures in CPSC1160 do not talk explicitly about OOP concepts, they are there, but informally. There is enough rigour, however, that when the students come to Assignment #3, they no longer ask which classes they should be creating. Encapsulation, data hiding (such as C++ allows), building abstraction barriers, have become second nature to them. They know how to sit down (together whenever appropriate) to design useful interfaces.

I feel that this comes about by spending a lot more time on the design process. Lab #3, Quiz #1, Midterm Exam #1, Lab #5 and usually the Final Exam all ask students to compose an ADT interface in English, and then gradually as the term goes by, more parts of an implementation are included. Class specification files transfer the original operation descriptions into internal documentation.

Larger assignments seem to be critical in achieving design competence. The group labs let students design and write a more substantial amount of code, as well as get the benefits of consultation with others that can help speed up the solution of problems. This code is included in assignment programs even though these are ultimately individual projects.

I believe there is support in my department and elsewhere for not worrying much about inheritance in an early course. To quote Page-Jones, [6, p.319], "The chief danger of inheritance lies in its overuse or--more precisely--in its misapplication in situations where other object-oriented constructs would be better."

I hope that there is here some food for thought about what it takes to produce people who can program well. At the same time I think the course instills a fair amount of conceptual material, even if subliminally. I have a feeling that this is a realistic approach in a college setting, because we can more closely supervise a lot more of the course activities. I am not sure whether this approach would be as easy to implement in a university setting, especially in programs where the number of credits is being squeezed. I suggest that making do with one fewer programming course can seriously affect the software competence of Computer Science students in the latter years of their programs.

5. References

1. ACM Computing Curricula 2001, Appendix B
http://www.computer.org/education/cc2001/final/index.htm

2. Harold Abelson and Gerald Sussman. Structure and Interpretation of Computer Programs (2nd Ed.) MIT Press, 1996.
http://mitpress.mit.edu/sicp/full-text/book/book.html

3. David Berlinski. The Advent of the Algorithm: The 300-Year Journey from an Idea to the Computer. Harcourt, 2000.

4. Mark Headington & David Riley. Data Abstraction and Structures using C++. Jones and Bartlett, 1997.

5. Michael Main & Walter Savitch. Data Structures and Other Objects Using C++ (2nd Ed.) Addison-Wesley, 2001.

6. Meilor Page-Jones. Fundamentals of Object-Oriented Design in UML. Addison-Wesley, 2000.

6. Acknowledgements

Thanks are due to many conversations about teaching Computer Science that I have had with Vincent Manis over a period of more than 30 years, and with Eliza Kuttner for more than 15, that continue to this day. Also I would like to mention current colleagues at Langara College, in particular Brian Greene and Nalin Wijesinghe, for their ongoing contributions. Finally to UBC which supported good teaching and to Habib Kashani who always went out of his way to encourage it.

Michael Kuttner
Department of Computer Science
Langara College
Vancouver B.C. V5Y 2Z6
Email: mkuttner@langara.bc.ca