Software:
Windows 8 Pro 64-bit
Cygwin 1.7.17-1 (1.7.18 as reported by "cygcheck -c cygwin")
JDK 1.7.0_15 64-bit
Hadoop 1.1.1 (1.0.4 as reported by "bin/hadoop version")
0. Prerequisites
A. Installing Cygwin
When asked to select packages to install, make sure to select "openssh" (and also "openssl") in the "Net" category. OpenSSH is required for Hadoop clusters and the Eclipse plugin to work properly [1].
To avoid the configuration work altogether, another option is to run Hadoop in a virtual machine, as provided in Yahoo's tutorial [7].
B. Downloading Hadoop
Download Hadoop from hadoop.apache.org and unpack it, for example to C:\hadoop\.
Edit the file "conf/hadoop-env.sh" and set at least JAVA_HOME to the root of the Java installation:
export JAVA_HOME="/cygdrive/c/Program Files/Java/jre7"
(or make a symbolic link to a path without spaces: ln -s /cygdrive/c/Program\ Files/Java/jdk1.7.0_15 /usr/local/jdk1.7.0_15)
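Because the Windows Java path contains a space, a quoting mistake in JAVA_HOME is easy to make. The following is a minimal sketch of a sanity check, assuming a hypothetical helper named check_java_home (it is not part of Hadoop or Cygwin). It only tests that bin/java, or bin/java.exe, exists and is executable under the given directory:

```shell
# Hypothetical helper (not part of Hadoop): check that a candidate
# JAVA_HOME directory contains an executable bin/java or bin/java.exe.
check_java_home() {
    candidate="$1"
    if [ -x "$candidate/bin/java" ] || [ -x "$candidate/bin/java.exe" ]; then
        echo "OK: $candidate"
        return 0
    fi
    echo "No executable java found under: $candidate/bin" >&2
    return 1
}
```

For example, check_java_home "/cygdrive/c/Program Files/Java/jre7" could be run before the same value is written into conf/hadoop-env.sh. The argument must be quoted so that the space in "Program Files" is preserved.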
Hadoop can then run in non-distributed mode, also called Local (Standalone) Mode, as a single Java process. Besides the "grep" example [5] in the official Hadoop documentation, the following is a "wordcount" example from IBM developerWorks China [2]:
$ cd /cygdrive/c/hadoop/
$ mkdir test-in
$ cd test-in
$ echo "hello world bye world" >file1.txt
$ echo "hello hadoop goodbye hadoop" >file2.txt
$ cd ..
$ bin/hadoop jar hadoop-examples-*.jar wordcount test-in test-out
$ cat test-out/*
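To judge whether the job produced sensible results, the same counts can be computed with standard Unix tools alone. This is only a sketch for cross-checking, not how Hadoop computes them. The function name local_wordcount is hypothetical, and the input files are recreated so that the snippet is self-contained:

```shell
# Recreate the two input files from the example above.
mkdir -p test-in
echo "hello world bye world" > test-in/file1.txt
echo "hello hadoop goodbye hadoop" > test-in/file2.txt

# Hypothetical helper: split every file in a directory into one word
# per line, count the occurrences, and print "word<TAB>count".
local_wordcount() {
    cat "$1"/* | tr -s ' \t' '\n' | sort | uniq -c | awk '{ print $2 "\t" $1 }'
}

local_wordcount test-in
```

For this input the pipeline prints bye 1, goodbye 1, hadoop 2, hello 2 and world 2, one tab-separated pair per line. If the Hadoop job succeeded, the files in test-out should contain the same pairs.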
In case of the IOException below:
ERROR security.UserGroupInformation: Privi