Osprey: Implementing MapReduce-style fault tolerance in a shared-nothing distributed database

Christopher Yang; Christine Yen; Ceryen Tan; Samuel R Madden

doi:10.1109/ICDE.2010.5447913

Back

Osprey: Implementing MapReduce-style fault tolerance in a shared-nothing distributed database

Conference proceeding

Open access

Osprey: Implementing MapReduce-style fault tolerance in a shared-nothing distributed database

Christopher Yang, Christine Yen, Ceryen Tan and Samuel R Madden

2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), pp 657-668

Mar 2010

DOI: https://doi.org/10.1109/ICDE.2010.5447913

Featured in Collection : UN Sustainable Development Goals @ Drexel

Files and links (1)

url

http://hdl.handle.net/1721.1/59970View

Published, Version of Record (VoR) Open CC BY-NC V4.0

Abstract

Computer crashes

Data warehouses

Database systems

Discrete transforms

Distributed databases

Failure analysis

Fault tolerance

Fault tolerant systems

Load management

Middleware

In this paper, we describe a scheme for tolerating and recovering from mid-query faults in a distributed shared nothing database. Rather than aborting and restarting queries, our system, Osprey, divides running queries into subqueries, and replicates data such that each subquery can be rerun on a different node if the node initially responsible fails or returns too slowly. Our approach is inspired by the fault tolerance properties of Map Reduce, in which map or reduce jobs are greedily assigned to workers, and failed jobs are rerun on other workers. Osprey is implemented using a middleware approach, with only a small amount of custom code to handle cluster coordination. Each node in the system is a discrete database system running on a separate machine. Data, in the form of tables, is partitioned amongst database nodes and each partition is replicated on several nodes, using a technique called chained declustering [1]. A coordinator machine acts as a standard SQL interface to users; it transforms an input SQL query into a set of subqueries that are then executed on the nodes. Each subquery represents only a small fraction of the total execution of the query; worker nodes are assigned a new subquery as they finish their current one. In this greedy-approach, the amount of work lost due to node failure is small (at most one subquery's work), and the system is automatically load balanced, because slow nodes will be assigned fewer subqueries. We demonstrate Osprey's viability as a distributed system for a small data warehouse data set and workload. Our experiments show that the overhead introduced by the middleware is small compared to the workload, and that the system shows promising load balancing and fault tolerance properties.

Metrics

12 Record Views

33 citations in Web of Science

58 citations in Scopus

See more details

Details

Title: Osprey: Implementing MapReduce-style fault tolerance in a shared-nothing distributed database
Creators: Christopher Yang - CSAIL, MIT, Cambridge, MA, USA
Christine Yen - CSAIL, MIT, Cambridge, MA, USA
Ceryen Tan - CSAIL, MIT, Cambridge, MA, USA
Samuel R Madden - CSAIL, MIT, Cambridge, MA, USA
Publication Details: 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), pp 657-668
Publisher: IEEE
Resource Type: Conference proceeding
Language: English
Academic Unit: Information Science
Web of Science ID: WOS:000286933100069
Scopus ID: 2-s2.0-77952779172
Other Identifier: 991021855281504721

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#9 Industry, Innovation and Infrastructure

Source: SDGs in the Output

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Web of Science research areas: Computer Science, Theory & Methods; Engineering, Electrical & Electronic

Osprey: Implementing MapReduce-style fault tolerance in a shared-nothing distributed database

Files and links (1)

Abstract

Metrics

Details

UN Sustainable Development Goals (SDGs)

InCites Highlights

Drexel University Social media