In this post, you will learn how to set up a basic MapReduce WordCount application using Maven, Java, and Eclipse, and how to run it in Hadoop's local mode, which makes debugging easier at an early stage.
Before starting this project, you need to install JDK 1.6 and set up Eclipse with the Maven plugin so that dependencies can be downloaded from the default Maven repository.
Problem: count the occurrences of each word in an input file using a MapReduce application.
1. Adding Dependency
Create a Maven project in Eclipse and declare the Hadoop dependency in its pom.xml.
2. Mapper Program
The mapper tokenizes the file, iterates over the words, and emits a count of one for every word it encounters.
Our mapper class should extend Hadoop's Mapper class and override its map method. When the method is called, its value parameter holds the portion of the input file to be processed (a single line with the default text input format), and the output parameter is used to emit each word instance.
In a real clustered setup, this mapper runs in parallel across multiple nodes, and its output is consumed by the reducers in the next stage of the application.
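The tokenize-and-emit logic of the map step can be tried out without any Hadoop classes. The following is only a plain-Java sketch of that logic, not the Hadoop mapper itself: the class name WordCountMapSketch and its static map method are hypothetical stand-ins, and the returned list plays the role of the pairs a real mapper would emit through its output parameter.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical, Hadoop-free illustration of the map step.
class WordCountMapSketch {

    // Stand-in for the map method: takes one line of input and
    // returns a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> emitted =
                new ArrayList<Map.Entry<String, Integer>>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            // A real mapper would write this pair to its output here.
            emitted.add(new SimpleEntry<String, Integer>(tokens.nextToken(), 1));
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Print each emitted pair as "word<TAB>1".
        for (Map.Entry<String, Integer> pair : map("to be or not to be")) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

Note that repeated words are emitted once per occurrence, each with a count of one; adding them up is left to the reduce step.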
3. Reducer Program
Our reducer extends the Reducer class and implements the logic to sum the occurrences of each word token received from the mappers. The result appears in the output folder as a text file, part-r-00000, along with a _SUCCESS file.
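The summing logic of the reduce step can likewise be sketched in plain Java. This is an illustration under assumptions rather than the Hadoop reducer: WordCountReduceSketch and its static reduce method are hypothetical names, and the Iterable argument stands in for the grouped counts that the framework would hand to a real reducer for one word.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, Hadoop-free illustration of the reduce step.
class WordCountReduceSketch {

    // Stand-in for the reduce method: receives one word together with
    // every count emitted for it and returns the total.
    static int reduce(String word, Iterable<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Three mappers (or three occurrences) each emitted ("be", 1).
        List<Integer> counts = Arrays.asList(1, 1, 1);
        System.out.println("be\t" + reduce("be", counts));
    }
}
```

The word itself is not needed to compute the sum; it is passed along so that the total can be written out next to its key.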
4. Driver Program
The driver program configures the Job by supplying the map and reduce classes we wrote, along with the input and output parameters.
Give it a try and build your own basic MapReduce application. Once it is working, do share your feedback. For any other assistance, leave a comment and the Hadoop software application developers team will get back to you soon.
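To get a feel for what the configured job computes from input to output, the whole pipeline can be imitated in a few lines of plain Java. This is a rough sketch only and does not use or replace the Hadoop Job API: the class WordCountLocalRunSketch and its run method are hypothetical, the input is an in-memory list instead of a file, and a sorted map is assumed as a stand-in for the grouping that happens between the map and reduce steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical, Hadoop-free imitation of a complete WordCount run.
class WordCountLocalRunSketch {

    static List<String> run(List<String> inputLines) {
        // Map and group: tokenize each line and collect a 1 per occurrence,
        // keyed by word. TreeMap keeps the keys in sorted order.
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (String line : inputLines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                String word = tokens.nextToken();
                List<Integer> ones = grouped.get(word);
                if (ones == null) {
                    ones = new ArrayList<Integer>();
                    grouped.put(word, ones);
                }
                ones.add(1);
            }
        }

        // Reduce: sum the counts per word and format "word<TAB>total",
        // similar to the lines found in part-r-00000.
        List<String> outputLines = new ArrayList<String>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int total = 0;
            for (int one : entry.getValue()) {
                total += one;
            }
            outputLines.add(entry.getKey() + "\t" + total);
        }
        return outputLines;
    }

    public static void main(String[] args) {
        for (String line : run(Arrays.asList("to be or not to be", "to do"))) {
            System.out.println(line);
        }
    }
}
```

For the two sample lines, the run produces one line per distinct word (be 2, do 1, not 1, or 1, to 3), which is a handy reference to compare against when checking the real job's output on the same input.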