A map-reduce framework is popular for big data analysis. In the typical map-reduce framework, both master node and worker nodes can use hard-disk drives (HDDs) as local disks for the map-reduce computation. However, because of the inherit mechanical problems of HDDs, the I/O performance is a bottleneck for the map-reduce framework when I/O-intensive applications (e.g., sorting) are performed. Replacing HDDs with solid-state drives (SSDs) is not economical, although SSDs have better performance than HDDs. In this paper, we propose a virtualization-based hybrid storage system for the map-reduce framework. The objective of the paper is to combine the advantages of the fast access property of SSDs and the low cost of HDDs by realizing an economical design and improving I/O performance of a map-reduce framework in a virtualization environment. We propose three storage combinations: SSD-based, HDD-based, and a hybrid of SSD-based and HDD-based storage systems which balances speed, capacity, and lifetime. According to experiments, the hybrid of SSD-based and HDD-based storage systems offers superior performance and economy.
Aseffa DEREJE TEKILU
National Taiwan University of Science and Technology
Chin-Hsien WU
National Taiwan University of Science and Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Aseffa DEREJE TEKILU, Chin-Hsien WU, "A Virtualization-Based Hybrid Storage System for a Map-Reduce Framework" in IEICE TRANSACTIONS on Information,
vol. E99-D, no. 9, pp. 2248-2258, September 2016, doi: 10.1587/transinf.2015EDP7365.
Abstract: A map-reduce framework is popular for big data analysis. In the typical map-reduce framework, both master node and worker nodes can use hard-disk drives (HDDs) as local disks for the map-reduce computation. However, because of the inherit mechanical problems of HDDs, the I/O performance is a bottleneck for the map-reduce framework when I/O-intensive applications (e.g., sorting) are performed. Replacing HDDs with solid-state drives (SSDs) is not economical, although SSDs have better performance than HDDs. In this paper, we propose a virtualization-based hybrid storage system for the map-reduce framework. The objective of the paper is to combine the advantages of the fast access property of SSDs and the low cost of HDDs by realizing an economical design and improving I/O performance of a map-reduce framework in a virtualization environment. We propose three storage combinations: SSD-based, HDD-based, and a hybrid of SSD-based and HDD-based storage systems which balances speed, capacity, and lifetime. According to experiments, the hybrid of SSD-based and HDD-based storage systems offers superior performance and economy.
URL: https://globals.ieice.org/en_transactions/information/10.1587/transinf.2015EDP7365/_p
Copy
@ARTICLE{e99-d_9_2248,
author={Aseffa DEREJE TEKILU, Chin-Hsien WU, },
journal={IEICE TRANSACTIONS on Information},
title={A Virtualization-Based Hybrid Storage System for a Map-Reduce Framework},
year={2016},
volume={E99-D},
number={9},
pages={2248-2258},
abstract={A map-reduce framework is popular for big data analysis. In the typical map-reduce framework, both master node and worker nodes can use hard-disk drives (HDDs) as local disks for the map-reduce computation. However, because of the inherit mechanical problems of HDDs, the I/O performance is a bottleneck for the map-reduce framework when I/O-intensive applications (e.g., sorting) are performed. Replacing HDDs with solid-state drives (SSDs) is not economical, although SSDs have better performance than HDDs. In this paper, we propose a virtualization-based hybrid storage system for the map-reduce framework. The objective of the paper is to combine the advantages of the fast access property of SSDs and the low cost of HDDs by realizing an economical design and improving I/O performance of a map-reduce framework in a virtualization environment. We propose three storage combinations: SSD-based, HDD-based, and a hybrid of SSD-based and HDD-based storage systems which balances speed, capacity, and lifetime. According to experiments, the hybrid of SSD-based and HDD-based storage systems offers superior performance and economy.},
keywords={},
doi={10.1587/transinf.2015EDP7365},
ISSN={1745-1361},
month={September},}
Copy
TY - JOUR
TI - A Virtualization-Based Hybrid Storage System for a Map-Reduce Framework
T2 - IEICE TRANSACTIONS on Information
SP - 2248
EP - 2258
AU - Aseffa DEREJE TEKILU
AU - Chin-Hsien WU
PY - 2016
DO - 10.1587/transinf.2015EDP7365
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E99-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2016
AB - A map-reduce framework is popular for big data analysis. In the typical map-reduce framework, both master node and worker nodes can use hard-disk drives (HDDs) as local disks for the map-reduce computation. However, because of the inherit mechanical problems of HDDs, the I/O performance is a bottleneck for the map-reduce framework when I/O-intensive applications (e.g., sorting) are performed. Replacing HDDs with solid-state drives (SSDs) is not economical, although SSDs have better performance than HDDs. In this paper, we propose a virtualization-based hybrid storage system for the map-reduce framework. The objective of the paper is to combine the advantages of the fast access property of SSDs and the low cost of HDDs by realizing an economical design and improving I/O performance of a map-reduce framework in a virtualization environment. We propose three storage combinations: SSD-based, HDD-based, and a hybrid of SSD-based and HDD-based storage systems which balances speed, capacity, and lifetime. According to experiments, the hybrid of SSD-based and HDD-based storage systems offers superior performance and economy.
ER -