An Alternative View on Data Processing Pipelines from the DOLAP 2019 Perspective
Journal article   Open access   Peer reviewed

Oscar Romero, Robert Wrembel and Il-Yeol Song
Information systems (Oxford), v 92, p101489
01 Sep 2020
url
https://doi.org/10.1016/j.is.2019.101489
Published, Version of Record (VoR), CC BY-NC-ND V4.0, Open

Abstract

Computer Science; Computer Science, Information Systems; Science & Technology; Technology
Data science requires constructing data processing pipelines (DPPs), which span diverse phases such as data integration, cleaning, pre-processing, and analysis. However, current solutions lack a strong data engineering perspective. As a consequence, DPPs are error-prone, inefficient w.r.t. human effort, and inefficient w.r.t. execution time. We claim that DPP design, development, testing, deployment, and execution should benefit from a standardized DPP architecture and from well-known data engineering solutions. This claim is supported by our experience in real projects and by trends in the field, and it opens new paths for research and technology. In this spirit, we outline five research opportunities that represent novel trends towards building DPPs. Finally, we note that the best DOLAP 2019 papers selected for the DOLAP 2019 Information Systems Special Issue fall in this category and highlight the relevance of advanced data engineering for data science. © 2019 Published by Elsevier Ltd.

Metrics

12 Record Views
7 citations in Scopus

Details

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types
Domestic collaboration
International collaboration
Web of Science research areas
Computer Science, Information Systems