## Launch IntelliJ
## create Java project - for example "wordcount"
select "Create New Project"
-> "Java project"
-> select 1.8 JDK in "Project SDK"
-> Next -> Next
-> enter "wordcount" in "Project name"
-> enter "~/work/wordcount" in "Project location"
-> Finish
## create Java Class
select File -> New -> "Java Class" -> enter "WordCount" -> Create
a new file, WordCount.java, is then generated under src/
## add the Hadoop module dependencies
select File -> "Project Structure" -> Modules -> Dependencies
-> "+" -> "JARs or directories"
browse to the Hadoop installation and select the jar files under the following directories:
share/hadoop/common
share/hadoop/common/lib
share/hadoop/mapreduce
share/hadoop/yarn
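
To confirm that the jars above were picked up, a small check can be run from the IDE before writing any MapReduce code. This is only a sketch: HadoopClasspathCheck and isOnClasspath are hypothetical names, not part of Hadoop, and the three class names are assumed to be representative of the common, mapreduce and yarn jars.

```java
// Hypothetical helper (not part of Hadoop): reports whether a few well-known
// Hadoop classes are visible on the classpath of the current run configuration.
final class HadoopClasspathCheck {

    // Returns true if the named class can be located, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, HadoopClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed representatives of the jar directories listed above.
        String[] required = {
            "org.apache.hadoop.conf.Configuration",          // share/hadoop/common
            "org.apache.hadoop.mapreduce.Job",               // share/hadoop/mapreduce
            "org.apache.hadoop.yarn.conf.YarnConfiguration"  // share/hadoop/yarn
        };
        for (String name : required) {
            System.out.println((isOnClasspath(name) ? "found    " : "MISSING  ") + name);
        }
    }
}
```

If a line is reported as MISSING, the corresponding directory was probably not added under Modules -> Dependencies.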
## configure
select Run -> "Edit Configurations" -> "+" -> Application
-> enter "WordCount" in "Main class"
-> enter "input output" (the input and output paths) in "Program arguments"
-> Apply -> OK
put the input text under "~/work/wordcount/input" (relative paths resolve against the project directory)
## write the code, then run / debug
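
These notes do not show the code itself. As a sketch, WordCount.java could follow the well-known WordCount example from the Apache Hadoop MapReduce tutorial; the nested class names TokenizerMapper and IntSumReducer are taken from that example and are an assumption here, not something these notes prescribe. It needs the Hadoop jars added above in order to compile.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map step: emits (word, 1) for every whitespace-separated token in a line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce step: sums the counts collected for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    // args[0] = input path, args[1] = output path (the "Program arguments" set above).
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation on the map side
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```
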
Note:
in IntelliJ, WordCount runs in Hadoop standalone mode
delete "~/work/wordcount/output" before each run; the job fails if the output directory already exists
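
In standalone mode the output is an ordinary local directory, so the clean-up before each run can be scripted. A minimal sketch using only java.nio; OutputCleaner and deleteRecursively are hypothetical names, and the default path "output" is an assumption matching the program arguments above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of Hadoop): removes a local output directory.
final class OutputCleaner {

    // Recursively deletes dir if it exists; returns true if something was deleted.
    static boolean deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: the relative "output" directory used as program argument.
        Path output = Paths.get(args.length > 0 ? args[0] : "output");
        System.out.println(deleteRecursively(output)
                ? "deleted " + output
                : "nothing to delete at " + output);
    }
}
```

An alternative would be to delete the path from inside the job's main method with Hadoop's FileSystem.delete(path, true), which would also work against HDFS.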
## create a jar file
$ cd out/production/wordcount
$ jar cvf ../../../wordcount.jar *.class
## run in Hadoop pseudo-distributed mode
$ hdfs dfs -rm -f -r output
$ hadoop jar wordcount.jar WordCount input output
$ hdfs dfs -cat output/*
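
For a small input, the result printed above can be cross-checked against a plain Java count that needs no Hadoop at all. This is a sketch under assumptions: LocalWordCount is a hypothetical name, tokens are split on whitespace, and the output is assumed to be tab-separated "word count" lines sorted by word, which for plain ASCII words should match what a reducer that sums counts per word writes.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical reference implementation for sanity-checking small inputs.
final class LocalWordCount {

    // Counts whitespace-separated tokens; TreeMap keeps the words sorted.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            counts.merge(tokens.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    // Formats the counts as "word<TAB>count" lines.
    static String format(Map<String, Integer> counts) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            sb.append(e.getKey()).append('\t').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints three tab-separated lines: Bye 1, Hello 1, World 2
        System.out.print(format(count("Hello World Bye World")));
    }
}
```
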
reference: https://tokluo.wordpress.com/2016/01/31/using-intellij-to-write-your-application/