Apache Hadoop 1.0 Released

By Kshitij Sobti | Updated 5 Jan 2012

Apache Hadoop, the open source framework for creating distributed applications, has just been updated to version 1.0.

There is only so much power you can put into a single system, so most popular web applications today require multiple servers to handle the kind of workload they receive.

How then are these large volumes of data stored, managed, processed, and queried? One answer is Apache Hadoop, which provides a Java-based software platform for building highly scalable distributed applications and storage. It can run on readily available commodity hardware.

Apache Hadoop provides two main pieces of functionality: a distributed filesystem (HDFS) and a Map/Reduce computation system for processing data. HDFS stores data redundantly: by default, the data is distributed in 64MB chunks across the network of storage nodes, with each chunk stored in three different places. HDFS is aware of the proximity and location of each piece of data, so it can route things for best efficiency.
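
To get a feel for what that chunking and replication mean in practice, here is a rough back-of-the-envelope sketch in Python. It is not part of Hadoop's API: the function name hdfs_footprint is made up for illustration, and the arithmetic simply assumes the 64MB chunk size and three copies described above.

```python
import math

BLOCK_SIZE_MB = 64   # chunk size described in the article
REPLICATION = 3      # number of copies of each chunk, per the article


def hdfs_footprint(file_size_mb):
    """Hypothetical helper: estimate how a file would be spread over HDFS.

    Returns (number of chunks, total chunk copies, raw storage in MB).
    """
    # A file is cut into fixed-size chunks; the last chunk may be partly full.
    chunks = math.ceil(file_size_mb / BLOCK_SIZE_MB)
    # Every chunk is stored in REPLICATION different places.
    copies = chunks * REPLICATION
    # Assumption: a partly full last chunk only occupies its actual size.
    raw_storage_mb = file_size_mb * REPLICATION
    return chunks, copies, raw_storage_mb


print(hdfs_footprint(1000))  # (16, 48, 3000)
```

Under these assumptions, a 1000MB file would be split into 16 chunks, stored as 48 chunk copies, and take up roughly 3000MB of raw disk space across the cluster.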

The Map/Reduce model of Apache Hadoop can be used for computing on large amounts of data by breaking the task into small units of work that can be done in parallel.
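
The idea is easiest to see with the classic word-count example. The sketch below is plain Python running on a single machine, not Hadoop code; the function names (map_phase, shuffle, reduce_phase, word_count) are chosen here purely for illustration. It only mimics the shape of a Map/Reduce job: a map step that works on each line independently, a grouping step, and a reduce step that combines the results.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all the values seen for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # Each line is an independent unit of work; on a real cluster these
    # map calls could run in parallel on different nodes.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(word_count(["hadoop stores data", "hadoop processes data"]))
# {'hadoop': 2, 'stores': 1, 'data': 2, 'processes': 1}
```

Because no map call depends on any other, the work can be handed out to as many machines as are available, which is what makes the model suit large data sets.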

If you are interested in Apache Hadoop, you can visit the project's wiki to find out more about it; you can see what's new in the latest 1.0 release here; and you can download the project from here.
