Critical insight for MapReduce optimization in Hadoop
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Open Science 2014 |
Subjects: | |
Online Access: | http://irep.iium.edu.my/36441/ http://irep.iium.edu.my/36441/2/download.pdf |
Summary: | Cloud computing has become an essential requirement for the majority of IT organizations. Cloud applications such as data storage, data retrieval and data portability have become significant requirements for cloud computing, and numerous applications are being developed for Big Data. Achieving higher performance in terms of efficient load balancing, load distribution, optimum resource utilization, minimum overhead and the least possible delay has been a vital issue for cloud infrastructure. Apache Hadoop is one of the most widely used frameworks for cloud infrastructure. The predominant philosophy behind Hadoop optimization is the optimization of MapReduce, a dominant programming platform that can deliver many functional enhancements, depending on the scheduling algorithms developed and implemented. MapReduce has emerged as the most significant part of the Hadoop system, a framework that simplifies the overall complexity of running parallel data processes across a network of computing nodes. A number of scheduling techniques have been proposed in the last couple of years to improve load balancing in Hadoop. Unfortunately, Hadoop still lacks a system model that delivers optimized performance without creating much computational overhead. To pave the way for a load balancing and job scheduling scheme that achieves minimum execution time and optimum resource utilization, this paper presents a comprehensive review of some of the major works and discusses the key issues that need to be addressed when developing such a scheme. |