Fresh as a ROSE: Java + C in an introductory OS course

Mantis Cheng, Pawel Miskiewicz, Kui Wu and Michael Zastre

Department of Computer Science, University of Victoria

{mcheng, pmiskiew, wkui, zastre}@cs.uvic.ca

 

Abstract

 

The Realtime Java Operating System Emulator (or ROSE) was introduced in September 2002 as a programming platform for our third year course  in Computer Science and Engineering at UVic, “Introduction to Operating Systems”. The emulator system is written in a combination of  Java and C, and therefore leverages our students’ experience with these languages. It consists of a Java VM, a collection of Java native methods, host OS interface, and a collection of Java packages.  The Java VM is based on the simpleRTJ (http://www.rtjcom.com), ROSE provides students with a simple implementation of java.lang. OS services may be written as new native methods in C. Various Java packages (uvic.disk, uvic.net, uvic.posix) have been implemented this way.  Students are required to add new functionality to their own copies of the emulator. ROSE makes available a number of important operating system concepts to be explored in a safe manner by students, including interprocess communication and synchronization, pre-emptive CPU scheduling, disk scheduling and file systems, socket programming. Our students are then able to translate concepts covered in the lectures into running code. We present a report of our experiences to date with our new system.

 

Keywords: software, operating systems, process management, OS emulation

 

           

 

Introduction

 

The Realtime java Operating System Emulator (ROSE) has been used for two semesters in the Department of Computer Science at the University of Victoria. Our report is of a “work in progress” as the authors continue to experiment and develop this system in the classroom.

 

ROSE differs in some significant ways from other pedagogical OS simulators and emulators. We present our motivation for developing ROSE, our experiences after two semesters of using the emulator, and future directions. Our current version works in the “Cygwin” Windows environment and on our Solaris/Intel workstations, requires no extra support from system administrators or laboratory staff, and consumes a little over 3 megabytes of disk storage space per copy.

 

We will first provide a context for the use of ROSE in our curriculum, and then move to a brief overview of ROSE system where some technical details will be discussed. There are many assignments that can be constructed using this platform, but we will restrict ourselves to only two of them for their contrast. Next are our experiences from using this tool along with some student feedback, comparisons between our ROSE system and other existing tools for OS instruction. We close with a description of some future work.

 

 

A Short Word about Pedagogy

 

Instructors who enjoy teaching systems courses (operating systems, compilers, networking, etc.) are often idiosyncratic and prefer teaching the course “their way”. This is nowhere more obvious than when teaching an introductory operating systems course. The following questions are asked time and time again [GOL]: Should students be exposed to OS concepts by using a simulator? Should students be instructed on how to use OS services and limit all course topics to this interface? Or rather, should students be given a real OS kernel and asked to add services or modify existing implementation of services?

 

We feel the separation is more apparent than real and have opted for an approach that mixes together these three. They are:

 

  • Any system used should limit the workload of students. They must become familiar with a non-trivial program in a short period of time. Operating system code is non-trivial to read and understand.
  • Students are best exposed to complex concepts using a programming language familiar to them. The tools used should not be part of the problem.
  • The mental models of an operating system are best derived from experiences in implementing functionality and organization as discussed in the lectures; these implementations must be related to services provided by existing production operating systems.

 

We believe working on both sides of the “OS interface” provides the greatest insight to students as they write user-level programs using the OS-level functionality they themselves have added. This is our primary goal for ROSE.

 

 

Course and Context

 

CSC 360 (“Introduction to Operating Systems”) is a required course for all our Computer Science and Computer Engineering undergraduates at UVic. Prerequisites include CSC 225 (“Data Structures and Algorithms”), CSC 230 (“Computer Organization and Assembly Language”) and SENG 265 (“Introduction to Software Engineering”). The latter course provides, amongst other topics, instruction in Perl scripting, the UNIX environment, and the C programming language. CSC 360 is also a prerequisite for many fourth-year systems programming courses. We can safely assume that most of our students will have some exposure to Java and C programming. This course sequence suggested some constraints for our development of ROSE.

 

The range of topics is typical for an intro OS course – processes, threads, concurrency, scheduling, virtual memory, file systems, security, etc. In the future we expect these topics to change somewhat as courses are introduced as required by our new Bachelor of Software Engineering degree programme. For instance, more networking may be brought into the course, and concurrency topics split into a different course entirely. As we would like ROSE to be used for the next ten years, it must provide some flexibility for introducing new kinds of programming assignments due to changes to the course.

                                                                                        

Given the enrollment pressures at UVic, we now offer the course in each of our three semesters (i.e., year round). Class sizes range from 80 students in each winter term (fall and spring) to 50 students in the summer. There is no formal lab component – this is a significant difference from similar courses at different institutions. Some past offerings of our course used designated drop-in lab space for students completing our assignments, but the labs tend to be small (10 workstations) which may put undesirable pressures on students. Ensuring that ROSE can work on a variety of platforms, especially on equipment in our large drop-in labs, is one important goal. In addition these drop-in lab systems must not be rendered inoperable as a result of students working on our assignments. Ideally students would also be able to work on their own personal systems. Students may work also in pairs, and a web log system is made available to students in order to facilitate cooperative learning.

 

 

Overview of ROSE

 

The system runs as an application in a UNIX environment – at present this must be either Solaris-Intel or Cygwin (i.e., must run on an operating system on top of Pentium hardware). Figure 1 shows the layering of the system along with the directories holding the source code for each layer.

 

                                                                                                        

Figure 1 – Layered view of ROSE

                      

The lowest ROSE layer is a Java VM originally called simpleRTJ (or “simple real-time Java”). It is written in a straight-forward style of C and provides an unoptimized implementation of a JVM. Located at this layer is such functionality as the thread abstraction, thread scheduling, concurrency primitives, network access, heap allocation, garbage collection, and disk simulation. This layer constitutes the “kernel” mode of our emulator.

 

The middle ROSE layer is a collection of Java class files that are part of a “uvic” package. For many system services, the methods in these class files act as interfaces to the native methods. However, some ROSE functionality may be added in this layer in the same spirit as other operating system organizations that allow services to be implemented in user mode. Therefore the two gray layers of the diagram correspond to the set of services offered to the topmost application layer.

 

The application layer is for user-level programs using ROSE services, which are written in Java and compiled using javac. ROSE doesn’t support dynamic class files loading but uses a special linker to bind the compiled application class files into a single ROSE-linked binary (class) file.  The bottom-two ROSE layers are re-built separately from programs in this top level.

 

An example “Hello, World” program is shown below. It involves two threads; their output order is constrained by the semaphores.

 

import uvic.posix.*;

 

class HW extends uvic.posix.Thread

{

        static Semaphore s1, s2;

 

        static void main(String args[])

        {

                System.println("Main");

 

                s1 = new Semaphore (0);  // initialize semaphore to count = 0

                s2 = new Semaphore (0);  // ditto

                HW t1 = new HW(1);

                HW t2 = new HW(2);

                t2.start(uvic.posix.Thread.USER);

                t1.start(uvic.posix.Thread.USER);

        }

 

        int id;

 

        public HW ( int id )

        {

                this.id = id;

        }

 

        public void run()

        {

                String message = Integer.toString(id) + " says hello world!";

                if (id == 1) {

                        System.println (message);

                        s1.Signal();

                        s2.Wait();

                        message = message + " (again)";

                        System.println (message);

                } else if (id == 2) {

                        s1.Wait();

                        System.println (message);

                        s2.Signal();

                } else {

                        System.println("panic!");

                }

        }

}

 

Figure 2 – Simple ROSE program demonstrating semaphore synchronization

 

 

After this program is compiled and linked, the ROSE emulator is invoked on the command line. The command, and the screen which follows, is shown below.

 

> ./rose 1 ../apps/HW.bin

 

Starting ROSE v.1.2 (Quantum = 1 msec) ...

Main

1 says hello world!

2 says hello world!

1 says hello world! (again)

 

ROSE terminated successfully (Time = 19 msec).

 

Figure 3 – HW.java program as executed by emulator

                                        

A few comments about Figures 2 and 3 are needed:

 

  • Concurrency is modeled by threads. A program is a main thread along with the child threads it creates. These threads observe many features of the POSIX model [REF Butenhof].
  • Shared variables amongst threads correspond to those variables declared as static. All other variables will have an instance for each thread – this follows from normal Java semantics.
  • Java as distributed in Sun Microsystem’s SDKs does not include Semaphore functionality. This is provided via the uvic.posix.* package (lib-source/uvic/posix/Semaphore.java); however, this class itself is just a native-method interface. The functionality for Semaphore.Wait() and Semaphore.Signal() is ultimately provided by code in rose/j_thread.c in the C functions SemWait() and SemSignal().
  • Implementations of classes such as System and Integer are contained in lib-source/java/lang. This code, however, is not the same as that provided by Sun’s SDK. For example, a line is printed using System.println(), not by System.out.println(). Further on the integer value of each thread’s “id” instance variable must be first converted to a string using Integer.toString(). Strings in ROSE do not understand the “+” message when the parameter is an integer. This is as distinct from regular Java usage where output would be via System.out.println(id + “says hello world!”).
  • Thread scheduling may occur after each bytecode instruction, which also means after each native method call. Splitting an output statement into two separate statements could yield interleaved lines amongst threads – in System.print(Integer.toString(id)); System.println(“ says hello world!”); there could be a context switch to another thread after the “print” message.
  • The value of the time given as ROSE terminates is wall clock time. This distinguishes ROSE from pure simulators where clock ticks are simulated; in our system, time spent in native methods is accounted for in the reported time. (Note: This the reason for our requirement of Pentium hardware – the time is obtained by reading the processor’s TSC register.)

 

There is a certain amount of detail that must be understood in order for students’ user-level programs to work correctly. As they already know how to read and write Java code, there is no new syntax to learn. However, there are new concepts to master, and the points at which concurrency becomes an issue can be easily understood.

 

At this point we must mentioned our reasons for departing from the standard Java semantics for threads and synchronization. Sun’s Java allows mutex-like locks to be associated with object instances or even instance methods via the synchronized keyword. There is one excellent text which already explores concurrency theory and practice by using “plain-vanilla” Java [MAG]. Why have we decided on a different approach, especially as students cannot transfer their ROSE expertise to other Java projects?

 

The answer is that, in our opinion, a simpler core of primitives is required to teach synchronization than those provided by SDK Java. Relating material from standard Operating Systems texts into SDK Java programs becomes a difficult exercise for students who must map from one barely understood set of concepts to their own shaky knowledge of Java synchronization. This is becoming a pressing issue as textbooks present code examples using a Java-like pseudocode, examples which refer to functionality not present in a typical Java implementation, and adding a new source of confusion for students. With POSIX synchronization primitives our students can then explore the use of these primitives and express plainly express their use. Our aim is to ensure students have good knowledge of concurrency primitives (both of their implementation and use) which in turn is transferable to other languages and systems.

 

 

 

Typical Assignments

 

Adding POSIX mutexes and condition variables

 

Students are first provided with a running system but with empty implementations of mutex and condition variables. However, implementations of semaphore “wait” and “signal” are still available, so there is some guidance on how to block threads, how to place these threads on queue, how to dispatch ready threads, and so on. All of this code is written in C and modifies functions in the lowest ROSE layer.

 

Once the implementation is complete, user-level programs needing these newly completed primitives are written. Any one of the large number of concurrency problems can be posed as problems for this step.

 

From term to term the assignment can be varied. For instance, instead of POSIX condition variables and their Mesa-style semantics, students might be asked to implement Hoare-style semantics [BUT]. POSIX primitives need not be used; reader/writer semaphores, or some other style of semaphores, could be used instead. There are about a dozen or so different concurrency primitives which could be the basis of an assignment, but control over source-code solutions is still especially required (i.e., in order to inhibit plagiarism).

 

void SemWait( int32 id )

{

    semaphore_t *s;

 

    s = SemaphoreOf( id );

    --(s->val);

    if (s->val < 0) {

        thr_active->state = BLOCK_ON_SEM;

        EnQ( &(s->blockQ), thr_active );

          /* select another thread to run */

        Dispatch();

    }

} /* end SemWait */

 

 

 

void SemSignal( int32 id )

{

    semaphore_t *s;

    thread_t *p;

 

    s = SemaphoreOf( id );

    ++(s->val);

    if (s-> val <= 0) { /* someone was waiting! */

        DeQ( &(s->blockQ), &p );

        AddReady( p, false );

        PreemptIfNecessary();

    }

} /* end SemSignal */

 

Figure 5 – rose/j_thread.c code for semaphore Wait and Signal

 

 

A very simple UNIX-like filesystem

 

Students are provided with a system that supports a simulated disk, i.e., reads and writes to the disk are in fixed-sized blocks. An inode-based file system is specified (number of inodes, size of inodes, format of directory blocks, whether single-indirect or double-indirect or both are implemented, free block format, etc.)  The semantics of file reads and writes must be kept straightforward and simple, even if they depart somewhat from UNIX-style semantics. Operations such as “open”, “close”, “read”, “write”, “mkdir”, “rmdir” and “delete” are to be implemented. A detailed reference such as [TAN] is helpfu when constructing such an assignment.

 

As with the previous assignment, the actual filesystem used as a model can be varied from term to term. One might specify a FAT-like filesystem; for another it could be a system with contiguous blocking. The vast research literature on filesystems is an excellent source of ideas.

                                                           

 

Real-time scheduling

 

ROSE also supports preemptive prioritized scheduling and periodic scheduling. In many real time applications, urgency is typically modeled by priority – more urgent threads have higher priority. For example, raw disk block transfer is typically implemented by a higher priority thread; the disk scheduler, on the other hand, runs at a lower priority. Getting data in/out of the disk is more urgent than knowing which disk block to read/write.

 

For real time applications, we often need to poll the hardware periodically, e.g., keyboard and mouse. For playing MP3 audio, uncompressed audio data must be sent to the codec periodically. Typically, a solution to these problems is using the “sleep” system call. “sleep” seldom provides accurate timing; as a result, we have jitter problem. Our periodic scheduling policy allows a thread to be scheduled at a predefined rate, e.g., once every 50 milliseconds.  We are seeing more and more applications that are time sensitive. It is important to educate our students when and how to apply this concept.

 

 

Experiences as a Teaching Tool

 

Reaction from students is – so far – generally positive. Students are evaluated by a mixture of demos to a teaching assistant, by the running of test scripts, and by code inspection. In these demos we have seen students make intriguing leaps of understanding. For instance, one assignment had students implement semaphores supporting the solution to the classic reader/writer problem. While doing this, several groups noticed that the solution for the problem found in nearly all textbooks – one that purports to give readers priority – did not, in fact, give readers priority in some cases (or the meaning of “reader priority” had to be changed to fit the solutions). These insights would have been difficult to attain without “working on both sides of the kernel”. The notion of “fairness” is often overlooked when discussing concurrency; but when it comes to implementation, all is bare to see.

 

Having students program in pairs also had many benefits. Attempts to construct assignments where work can be “split” amongst members does not, in practice, work well. Teams prefer to work together to solve the same problem, with the actual work of coding a solution quite secondary to understanding the nature of a correct solution. Web logs and discussion forums helped exchange amongst teams insights and “gotchas”.

 

Naturally the the majority of the instructor’s work is up front in constructing the assignment, which for the most part means writing up a solution and then subtracting some functionality. As ROSE is a small system, this can be accomplished by a single person over a term, especially if that person is comfortable working in C and Java. Initial exposure to the system takes about a week of reading and coding; thereafter constructing each assignment takes about two days.

 

We decided to use demonstrations as much as possible when marking student submissions. This helped us (1) probe students’ understanding, (2) allowed us to ask them questions about their implementation, and (3) gave us an opportunity to determine if problems building submissions were the results of glitches or of genuinely incorrect solutions. For each assignment and each team, about 20 minutes was required for a demo, which could easily be increased to 30 minutes. There were four assignments per term, and in a class with 40 pairs of students at 30 minutes per demo, the result is 80 hours of work over the term.

 

 

Related Systems

 

We have had some experience over the years with other OS teaching platforms and provide some comparisons with our own platform.

 

Nachos combines a user-level thread package with a MIPS simulator [NAC]. The former is the “kernel”, while the latter allows MIPS programs to be simulated and run as if in “user mode”. The kernel itself is fairly straightforward, albeit a bit difficult to compile depending on the version at hand and the OS/hardware platforms used by students in labs. One interesting feature of Nachos allows students to write their own C programs using a MIPS cross compiler and run them using the kernel. The memory model is straightforward, and implementing virtual memory is easily accomplished as part of an assignment. The result is a very realistic system – issues as discussed in class are reflected in the implementation.

 

One of us (Zastre) did use Nachos for three separate offerings of CSC 360, but finally stopped due to the difficult of constructing new assignments. Nachos is used in many Computer Science departments with active OS research groups, and grad students appear to be used to construct new assignments. Success in Nachos also assumes on the part of the students a high level of competence in the C programming language. However, this is becoming harder to obtain now that Java is the first-year programming language and students otherwise have only one year of C experience. Finally, students cannot run user-level programs until perhaps the third or fourth assignment, and even then they need access to a large and bulky cross-compiler. Working on their home machines is rarely an option for students.

 

OSP is a pure simulator – students implement services within an OS kernel, and then simulated workloads are applied to this implementation [KIF]. We are familiar with the C implementation of OSP, and this was used several years ago at UVic. Our concerns, however, are not with OSP as with the difficulty in relating course concepts to OSP results. Unless one is willing to modify significantly the part of OSP which simulates workloads, then there is the possibility that the same behavior will be seen by students as they implement clearly different algorithms. Unfortunately, the benefits of the simulator’s simplicitly are overshadowed by confusion that results.

 

 

Future Directions

 

ROSE is a work in progress. We would like to use it for many years in our introductory OS course – or at least until C and Java are no longer feasible languages for such work. Some remaining development items are listed below.

 

Not quite open source

 

Our system has been made possible by the kind permission of the creators of simpleRTJ (www.rtjcom.com). Their JVM and class files are cleanroom implementations (i.e., not subject to Sun copyrights). We have source code for all except the linker – for that we only have a JAR file. An open-source implementation for the next version of the emulator is made feasible by the availability of open-source JVMs and classfile implementations

 

Configuring for assignments requires work

                                           

Version management for instructors creating assignments is still somewhat problematic. There is always the use of CVS, but we would like to construct assignments that are “mixins” of others. For that we will need to use CVS plus some supporting scripts.

 

Linker crankiness

 

The RTJ linker resolves native method calls in Java classfiles to their stubs in rose/j_lang.c. This means that addition or removal of native methods results in changes to two files in separate directories. Not only must changes be synchronized, but the absolute order of native method entries listed in each file must be identical with the other. This must be done exactly right, especially when rolling out various assignments which include different functionality. One solution is to use a single XML file to represent assignment information and then generate the control information, Java files and C code via scripts.

More ambitious assignments for other subsequent courses

 

Our version of the emulator does support network communication, and we would like to use the platform in distributed systems courses. One possible assignment would implement, for example, a peer-to-peer file-sharing system, or a remote procedure call (RPC) mechanism, or one of various transaction protocols. Given a small enough JVM, the system could also be used for real-time OS courses which utilize robots as an implementation platform.

 

 

Conclusions

 

An operating system is one of the most complicated pieces of software running on any hardware. It must deal with the following issues: concurrency, resource allocation, performance and reliability, networking and security, etc. The design and structure of an operating system is an endless source of challenging problems in systems software engineering. As instructors, we are often faced with the difficulties of teaching “real” world problems to students and bringing theory to practice into a classroom. We would like students to learn important concepts without being overwhelmed by the complexity. Therefore teaching Operating Systems can be is a daunting task. The topic is so well-known. Most students and instructors knows what a real OS is like. However, few teachers dare to use real OS in a classroom for the reason that they are often far too complicated to learn important concepts. A pure OS simulator could only deal with quantitative aspect of the theory, but couldn’t relate to the intricacy of OS architecture and design. We designed and implemented ROSE to bring the OS theory and practice closer together in a pedagogical setting. Its effectiveness as a teaching tool is measured by how well students can learn the complex OS concepts in a relatively short 10 weeks time.

 

 

References

 

 

[BUT]  Butenhof, D. Programming with POSIX Threads. Addison-Wesley, 1997.

 

[GOL]  Goldweber, M., Barr, J., Camp, T., Grahm, J., and Hartley, S. A comparison of operating systems courseware. In Proceedings of of the thirtieth SIGCSE technical symposium on Computer Science education (1999), pp. 348—349.

 

[KIF]   Kifer, M., and Smolka, S. OSP: An Environment for Operating System Projects. Addison-Wesley, 1991.

 

[MAG] Magee, J., and Kramer, J. Concurrency: State Models & Java Programs. John Wiley & Sons, 1999.

 

[NAC] Nachos, April 2003, http://www.cs.washington.edu/homes/tom/nachos/.

 

[SIL]    Silberschatz, A. Galvin, and P., Gagne, G. Operating Systems Concepts (6th edition). John Wiley & Sons, 2001.

 

[TAN]  Tanenbaum, A., and Woodhull, A. Operating Systems: Design and Implementation (2nd edition). Prentice-Hall, Upper Saddle River, NJ, 1997.