SplitStream

Summary

The paper "SplitStream: High-Bandwidth Multicast in Cooperative Environments" by Castro et al. was presented at the SOSP 2003 conference. SplitStream is a method for achieving high-bandwidth multicast without relying on support from the network infrastructure, which traditional IP multicast requires.

SplitStream is a fully distributed application built on top of Pastry and Scribe. The main concept is to create a forest of standard multicast trees in which any given host is an interior node in only a single tree. This limits the outbound bandwidth required of each host, making the system suitable for situations where inbound and outbound bandwidth are asymmetric. Because the network has no single coordinator, it is possible that a SplitStream forest would fail to be built. A significant portion of the paper is given over to a proof that a forest can be constructed with high probability from a suitable number of hosts.
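The interior-node constraint described above can be illustrated with a small sketch. This is not the paper's construction: the actual system is built on Pastry and Scribe, which are not modeled here. The function names, the integer node ids, and the rule that node n may be interior only in tree n % k are all assumptions made for illustration.

```python
from collections import defaultdict


def build_forest(nodes, k, fanout):
    """Return k trees, each a dict mapping parent -> list of children.

    Hypothetical rule (not from the paper): node n may be an interior
    node only in tree number n % k; in every other tree it is a leaf.
    """
    forest = []
    for stripe in range(k):
        interior = [n for n in nodes if n % k == stripe]
        leaves = [n for n in nodes if n % k != stripe]
        # Interior candidates come first, so only they can receive children.
        order = interior + leaves
        # With an array-style layout, the first `parents_needed` positions
        # of `order` are the ones that end up with children.
        parents_needed = 0 if len(order) < 2 else (len(order) - 2) // fanout + 1
        if parents_needed > len(interior):
            # Mirrors the point that forest construction can fail.
            raise ValueError(f"stripe {stripe}: not enough forwarding capacity")
        children = defaultdict(list)
        for i in range(1, len(order)):
            children[order[(i - 1) // fanout]].append(order[i])
        forest.append(dict(children))
    return forest


def interior_count(forest):
    """Count, for each node, the number of trees in which it has children."""
    counts = defaultdict(int)
    for tree in forest:
        for parent in tree:
            counts[parent] += 1
    return dict(counts)
```

For example, build_forest(list(range(16)), 4, 4) yields four trees in which every host forwards data in exactly one tree. If each stripe carries 1/k of the stream and the fan-out is k, a host forwards at most k copies of a 1/k-rate stripe, so its outbound load is at most one full stream rate. Calling the same function with a fan-out of 2 raises an error, since the interior nodes then lack the capacity to reach every host.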

The application is tested in a simulated environment using a packet transport simulator. Cross traffic and packet loss are not modeled in the simulator. A number of different network configurations are tested, including GATech, Mercator, and CorpNet, as well as a configuration based on data collected on the hosts and available bandwidth of Gnutella users. The simulations show that even in worst-case scenarios the overhead of the disjoint multicast trees is negligible, and that the network restores itself even after the loss of as much as 25% of the nodes.

Class Discussion

This paper was presented as part of a series of papers on cooperative multicasting. The other systems covered were CoopNet (also from Microsoft) and Bullet. A number of comparisons were drawn between the three systems in the discussion. These are summarized in the list below.

The paper argues that deep trees introduce delays, and that wide fan-outs require significant outbound bandwidth. While this is true of a static situation, the trees are highly dynamic, with hosts entering and leaving constantly, so it is difficult to quantify whether this intuitive argument still matters when measured over a time interval.
It is important to prove that decentralized systems will actually work. While the simulations show that forests can be constructed, the probabilistic analysis matters just as much, because it shows that the probability of the network failing to form is relatively low.
Like the Bullet paper, this paper includes extensive simulation of the network application. However, the failure cases induced in this paper are dissimilar to the ones in the Bullet paper. As a result, it is difficult to compare the two strategies and conclude which is better.
The SplitStream application requires that the data be split into k stripes. This requires that the application using the method employ some type of coding, be it multiple description coding (MDC) for video or erasure coding for other data types. This is a drawback compared to Bullet, which has no such requirement.
Although this came from Microsoft Research, it is not known to be currently employed in any of their products.
The paper makes no mention of the underlying transmission protocol. We assume UDP, since TCP would induce much larger delays than are recorded in their simulations.
Bullet employs congestion control, while SplitStream makes no mention of this.
SplitStream employs a spare capacity tree that contains nodes which have not fully utilized their outbound bandwidth. This takes away (in Buck's opinion) from the elegance of the solution and moves it toward a 'hack'. The point of the distributed network is to always find an appropriate neighbor / parent for any given node. When this fails, the node is assigned into the spare capacity tree, implying that the application somehow failed to meet its intended purpose of finding an appropriate neighbor / parent.
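The requirement to split the data into k stripes, noted in the list above, can be sketched as follows. This is a minimal illustration that assumes plain round-robin striping of packets; it is a stand-in for MDC or erasure coding, not an implementation of either, so a lost stripe simply leaves gaps instead of being recovered. The function names are hypothetical.

```python
def split_into_stripes(packets, k):
    """Deal packets round-robin into k stripes (stripe i gets packets i, i+k, ...)."""
    return [packets[i::k] for i in range(k)]


def merge_stripes(stripes, total):
    """Rebuild the packet sequence from the stripes that arrived.

    A stripe that was lost is passed as None; its positions stay None
    in the result because this sketch has no redundancy to recover them.
    """
    k = len(stripes)
    merged = [None] * total
    for i, stripe in enumerate(stripes):
        if stripe is not None:
            merged[i::k] = stripe
    return merged
```

With 10 packets and k = 4, losing one stripe leaves the receiver with the packets from the other three stripes in their original positions. A real coding scheme would be needed to make that partial stream usable or to reconstruct the missing packets.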

Reference Material

The original paper that the talk was based on.

The Microsoft PowerPoint presentation of the paper.

Kevin Loken

Last modified: Sun Feb 13 21:10:03 PST 2005