Designing a Hadoop Map Reduce Performance Model using Micro Benchmarking Approach


  • Manal Tawalai Alalawi


Hadoop MapReduce platforms are now widely used for complex analysis of large data sets. In MapReduce environments, the parallel and distributed processing of big data has high energy requirements. Many organisations want to reduce the energy consumption of Hadoop MapReduce while maintaining the same level of performance. This paper proposes a platform performance model specifically for Hadoop MapReduce environments, with the aim of automatically improving the energy efficiency of these applications. Unlike existing performance models, the proposed model relates the varying amounts of processed data to the durations of the executed phases, using measurements collected from a set of executed micro benchmarks. The model's resource distribution strategy helps estimate job completion time on the basis of the resources allocated. Mathematical modelling and experiments indicate that the performance model is accurate and that it improves the energy efficiency of Hadoop MapReduce environments.
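To make the idea of relating processed data to phase durations more concrete, the following is a minimal sketch and not the paper's actual model. It assumes that each phase duration grows linearly with the amount of data processed, that tasks run in full waves over a fixed number of slots, and that the map and reduce phases do not overlap. The function names (`fit_line`, `estimate_job_time`) and all measurement values are hypothetical and chosen only for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def estimate_job_time(map_model, reduce_model,
                      map_tasks, reduce_tasks,
                      mb_per_map, mb_per_reduce,
                      map_slots, reduce_slots):
    """Estimate completion time in seconds under a simple wave model.

    Assumption (not from the paper): tasks of a phase run in
    ceil(tasks / slots) waves, each wave lasting one predicted task
    duration, and the reduce phase starts after the map phase ends.
    """
    a_m, b_m = map_model
    a_r, b_r = reduce_model
    map_task_s = a_m + b_m * mb_per_map
    reduce_task_s = a_r + b_r * mb_per_reduce
    map_waves = math.ceil(map_tasks / map_slots)
    reduce_waves = math.ceil(reduce_tasks / reduce_slots)
    return map_waves * map_task_s + reduce_waves * reduce_task_s


# Hypothetical micro benchmark measurements: (input size in MB, seconds).
map_mb, map_s = [64, 128, 256], [5.0, 9.0, 17.0]
reduce_mb, reduce_s = [64, 128, 256], [7.0, 13.0, 25.0]

map_model = fit_line(map_mb, map_s)
reduce_model = fit_line(reduce_mb, reduce_s)

# Estimated time for 10 map tasks on 4 slots and 4 reduce tasks on 4 slots.
estimate = estimate_job_time(map_model, reduce_model,
                             map_tasks=10, reduce_tasks=4,
                             mb_per_map=128, mb_per_reduce=64,
                             map_slots=4, reduce_slots=4)
print(f"estimated completion time: {estimate:.1f} s")
```

Varying `map_slots` and `reduce_slots` in such a sketch shows how an estimate of completion time responds to the resources allocated, which is the kind of trade-off a resource distribution strategy would need to weigh against energy use.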