ISSN

2277 - 3282

e ISSN

2277 - 3290

Publisher

Journal of Science

STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS)
Author / Affiliation
S. Chandra Mouliswaran

School of Information Technology and Engineering, VIT University, Vellore-632014, TamilNadu, India.
Shyam Sathyan

School of Information Technology and Engineering, VIT University, Vellore-632014, TamilNadu, India.
Keywords
Cluster, HDFS, MapReduce, Node Replica, High Availability, Distributed Computation
Abstract

Hadoop is a software framework that supports data-intensive distributed applications. It creates clusters of machines and coordinates the work among them. It includes two major components: HDFS (Hadoop Distributed File System) and MapReduce. HDFS is designed to store large amounts of data reliably and to provide high availability of that data to user applications running at the client. It splits data into multiple blocks and stores each block redundantly across a pool of servers to enable reliable, extremely rapid computation. MapReduce is a software framework for analyzing and transforming very large data sets into the desired output. This paper first focuses on how replicas are managed in HDFS to provide high availability of data under extreme computational requirements. It then examines the possible failures that can affect a Hadoop cluster and the failover mechanisms that can be deployed to protect the cluster.
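
The block-and-replica idea summarized in the abstract can be illustrated with a minimal sketch. It is not HDFS code and not the authors' method: the function names, the 64 MB block size, the replication factor of 3, and the simple round-robin placement are all assumptions chosen for illustration, and real HDFS placement is rack-aware rather than round-robin.

```python
# Illustrative sketch only: hypothetical helpers, not part of any Hadoop API.

BLOCK_SIZE = 64 * 1024 * 1024  # assumed block size in bytes (illustrative)
REPLICATION = 3                # assumed replication factor (illustrative)


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the byte length of each block for a file of file_size bytes."""
    if file_size <= 0:
        return []
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_replicas(num_blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin.

    This is a simplification; it ignores racks, disk usage, and load.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        b: [nodes[(b + r) % len(nodes)] for r in range(replication)]
        for b in range(num_blocks)
    }


def available_blocks(placement, failed_nodes):
    """Return the blocks that still have a replica on some live node."""
    return [
        b for b, holders in placement.items()
        if any(n not in failed_nodes for n in holders)
    ]


if __name__ == "__main__":
    blocks = split_into_blocks(200 * 1024 * 1024)
    placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4", "dn5"])
    for block_id, holders in placement.items():
        print(block_id, blocks[block_id], holders)
    print(available_blocks(placement, {"dn1", "dn2"}))
```

Under these assumptions, every block keeps a live replica as long as fewer nodes fail than the replication factor, which is the availability property the abstract refers to.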

Volume / Issue / Year

2 , 2 , 2012

Starting Page No / Ending Page No

65 - 70