International Journal of Advances in Electronics and Computer Science ( IJAECS )
A highly rated peer reviewed monthly International Journal
Editor-in-Chief : Dr. P. Suresh

Journal Info
Publisher: IRAJ
ISSN (p): 2394-2835
Issues/Year: 12

Paper Detail


Paper Title
Index based Approach in Hadoop Ecosystem for Performance Improvement

Abstract
In today's world, the term Big Data is familiar to most professionals and academicians. One possible definition is: "Data that is huge in size and beyond the processing capacity of a single computer or a small group of computers is called Big Data." The two important aspects of Big Data are the storing and the processing of data, and Big Data can itself be viewed as a problem. Apache Hadoop, on the other hand, is a solution to the Big Data problem. Hadoop is an open-source framework from the Apache Software Foundation for storing and processing large datasets; it is not suitable or recommended for small datasets. Its two main components are HDFS (Hadoop Distributed File System) for storage and MapReduce for processing. HDFS is a specially designed file system for storing large datasets on a cluster of commodity hardware with a streaming access pattern, while MapReduce is responsible for parallel processing of the datasets stored in HDFS. Because Hadoop stores the entire dataset as data blocks spread across the DataNodes of the cluster, searching for any specific data requires a MapReduce program to go through all the data blocks in HDFS. This paper deals with a strategy for searching for specific data in Hadoop in minimal time. For this, we introduce a new index approach in the Hadoop ecosystem by which we only need to go through those data blocks where the desired data is available, not all data blocks.

Keywords - Apache Hadoop EcoSystem, HDFS, Indexing in HDFS, InputSplits.
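
The abstract does not describe the structure of the index, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple in-memory map from record key to the ids of the blocks that contain it, and it does not use the Hadoop API; all function names and the toy dataset are hypothetical. It contrasts a search that reads every block with one that reads only the blocks named by the index.

```python
from collections import defaultdict


def split_into_blocks(records, block_size):
    """Group records into fixed-size blocks, loosely mimicking data blocks."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def full_scan(blocks, key):
    """Baseline search: read every block and collect the matching records."""
    hits = [rec for block in blocks for rec in block if rec[0] == key]
    return hits, len(blocks)  # (matches, number of blocks read)


def build_index(blocks):
    """Assumed index layout: record key -> set of block ids containing it."""
    index = defaultdict(set)
    for block_id, block in enumerate(blocks):
        for key, _value in block:
            index[key].add(block_id)
    return index


def indexed_search(blocks, index, key):
    """Index-guided search: read only the blocks listed for this key."""
    block_ids = sorted(index.get(key, ()))
    hits = [rec for b in block_ids for rec in blocks[b] if rec[0] == key]
    return hits, len(block_ids)  # (matches, number of blocks read)


# Toy dataset (hypothetical): 1000 (key, value) records in blocks of 100.
records = [(f"user{i}", i) for i in range(1000)]
blocks = split_into_blocks(records, block_size=100)
index = build_index(blocks)

scan_hits, scan_reads = full_scan(blocks, "user742")
idx_hits, idx_reads = indexed_search(blocks, index, "user742")
print("full scan read", scan_reads, "blocks; indexed search read", idx_reads)
```

Both searches return the same records; the difference is in how many blocks each one has to read. The sketch ignores the cost of building and storing the index, which a real deployment would have to account for.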


Author - Ashish Singh Parihar, Swarnendu Kumar Chakraborty

Published : Volume-6,Issue-6  ( Jun, 2019 )


DOIONLINE Number - IJAECS-IRAJ-DOIONLINE-15635

Published on 2019-08-13
   
   