October 7, 2003
The increasingly commodity nature of storage and our insatiable tendency to produce, store, and use large amounts of data exacerbate the problem of ensuring data survivability. The advent of large, robust networks has won widespread acceptance for the idea of replicating data on remote hosts. Unfortunately, the growth of network bandwidth is far outstripped by the growth of both storage capacity and our ability to fill it. Thus, most replication systems, which traditionally replicate data blindly, fail under this lopsided mismatch. We propose a Policy Driven Replication (PDR) system that prioritizes the replication of data based on user-defined policies that specify which data is to be protected, from which failures, and to what extent. By prioritizing which data is replicated, our system conserves limited resources and ensures that the data the user deems most important is protected from the failures deemed most likely to occur.
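
As a rough illustration of the idea, the sketch below models a policy as a record naming which data it covers, which failure it guards against, and to what extent (a replica count), then replicates the highest-priority items first until a transfer budget is exhausted. Everything here is an assumption made for illustration and is not taken from the PDR system itself: the `Policy` fields, the `FAILURE_LIKELIHOOD` table and its values, the scoring rule (importance times failure likelihood), and the greedy selection in `plan`.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record; field names are illustrative only."""
    pattern: str       # which data: a glob over file paths
    failure: str       # which failure to protect against
    copies: int        # to what extent: number of remote replicas
    importance: float  # user-assigned weight, higher is more important


# Assumed per-failure likelihoods (made-up values for the example).
FAILURE_LIKELIHOOD = {"disk": 0.05, "site": 0.001}


def plan(files, policies, budget_bytes):
    """Pick (path, failure, copies) tuples to replicate within a byte budget.

    files maps each path to its size in bytes. Candidates are ranked by
    importance * failure likelihood, then taken greedily while they fit.
    """
    candidates = []
    for path, size in files.items():
        for policy in policies:
            if fnmatch(path, policy.pattern):
                score = policy.importance * FAILURE_LIKELIHOOD[policy.failure]
                candidates.append((score, path, policy, size))

    # Highest score first; ties broken by path for a stable order.
    candidates.sort(key=lambda c: (-c[0], c[1]))

    chosen, used = [], 0
    for _score, path, policy, size in candidates:
        cost = size * policy.copies  # bytes sent over the network
        if used + cost <= budget_bytes:
            chosen.append((path, policy.failure, policy.copies))
            used += cost
    return chosen


if __name__ == "__main__":
    files = {"thesis/ch1.tex": 100, "music/a.mp3": 500, "thesis/data.bin": 300}
    policies = [
        Policy("thesis/*", "disk", copies=2, importance=10.0),
        Policy("music/*", "disk", copies=1, importance=1.0),
    ]
    for item in plan(files, policies, budget_bytes=900):
        print(item)
```

With a 900-byte budget, both thesis files are replicated twice (800 bytes in total) and the lower-priority music file is skipped because it no longer fits. A blind replicator given the same budget would have no basis for making that choice.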