CpSc 513: Review of CertiKOS paper

Mark's Review of the CertiKOS paper

Citation details:

“Building Certified Concurrent OS Kernels”
R. Gu, Z. Shao, et al.
Communications of the ACM, Vol. 62 No. 10 (Oct. 2019), pages 89-99.

Mark's summary

What problem does the paper address?
An operating system kernel is a critical piece of systems software impacting the correctness, robustness, and performance of applications that run on top of the kernel. Can we use formal verification methods prove that a kernel is correct? In particular, can such methods be applied to an OS kernel that supports fine-grained multi-threading running on a multi-core CPU?
What is the key insight/idea in the paper’s solution to this problem?
The authors promote using compositional techniques including sequential composition, concurrent composition, and multiple layers of refinement. While compositional methods have been known and promoted for 25+ years, the authors claim that recently developed tools, especially the Coq proof assistant and the CertCompX compiler.
What did the authors do to demonstrate their claims?
They sketched the structure of their proofs, describing the main abstraction layers and how they used the compositional methods described above. They report that they have a complete proof and describe some bugs that they found.
Is the support for the claims convincing?
I'll give this a qualified “yes”. My yes answer is that they have an OS that is apparently sufficiently capable that they can use it as a hypervisor and run linux as a guest OS. I'll bring up my concerns in my response to the final question.
Other questions and/or comments
I have several questions that I suspect are pointing to where there is room for future work to get from mc2 to a more capable OS.
- The paper talks about liveness, but their examples rely on having pre-computed bounds on how long a thread holds a lock. It is well-known that this converts the liveness properties into safety properties. That may be necessary, because compositional methods are much better developed for safety properties than liveness properties. For real applications, this may be overly restrictive, in the sense that time-out bounds are always wrong.
- Some key properties such as bounded response times described above and correct use of locks to protect memory accesses seem to rely on the application being correct. What happens if a faulty application is run on the verified operating system?
- Running a linux kernel as a guest OS is impressive. OTOH, if there are several such guest kernels, they are supposed to be isolated from each other. This doesn't seem to exercise the “fine-grain multithreading” features of mc2. They descript some simple experiments to measure lock throughput, and they mention that mc2 is intended for robot control applications. I'd like to see some examples in the intended application space.

Andrew Appel's summary

If you look at the table of contents of the CACM issue that had the CertiKOS paper, you'll see that Andrew Appel wrote a summary of the paper. See: Andrew's summary.

Possible discussion topics

Safety and Liveness properties
Compositional reading:
- Sequential composition, called “horizontal” in the paper.
- Abstraction and refinement, called “vertical” in the paper.
- Concurrent
Memory models and thread safety