1. Follow the guide from the official site, and make sure ssh localhost works without a password prompt.
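For reference, a common way to set up passwordless SSH, taken from the official single-node guide (assumes no existing key; key type and file names may differ on your machine):
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys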
2. Don't run Hadoop as the root user, or you'll be prompted for the root@localhost password.
Note: this happened when running with sudo ...; not sure if it also happens when logged in as root.
3. On my machine the current user didn't have ownership of hadoop/logs; it was resolved by creating a new directory owned by that user.
Example of the error: chown: changing ownership of `<hadoop/logs>': Operation not permitted
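One way to fix this, assuming Hadoop lives under /home/administrator/hadoop (the path and user name are borrowed from item 6 and may differ for you):
mkdir -p /home/administrator/hadoop/logs
sudo chown -R administrator:administrator /home/administrator/hadoop/logs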
4. To solve "Agent admitted failure to sign using the key", run the ssh-add command.
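For example, assuming the RSA key created in step 1:
ssh-add ~/.ssh/id_rsa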
5. To resolve "org.apache.hadoop.security.AccessControlException: Permission denied",
I was able to get this working with the following setting in mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.jobtracker.staging.root.dir</name>
    <value>/user</value>
  </property>
</configuration>
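If the error persists, it can also help to make sure the submitting user owns its directory under the staging root in HDFS (administrator is an assumed user name here, not from the original setup):
hadoop fs -mkdir /user/administrator
hadoop fs -chown administrator /user/administrator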
6. To configure where Hadoop stores its data files, add the properties below to hdfs-site.xml. (Note: hadoop.tmp.dir is conventionally set in core-site.xml, though this placement worked here.)
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/administrator/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/administrator/hadoop/filesystem/name</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/administrator/hadoop/filesystem/data</value>
    <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
  </property>
</configuration>
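After pointing dfs.name.dir at a new location, the namenode has to be formatted once before starting HDFS (this erases existing HDFS metadata, so only do it on a fresh setup):
hadoop namenode -format
Also make sure the directories above exist and are owned by the user running Hadoop, otherwise the ownership issue from item 3 can reappear.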
7. Enable pseudo-distributed mode, which makes more sense for simulating a real cluster.
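For reference, the minimal pseudo-distributed settings from the official Hadoop 1.x single-node guide (the ports are the guide's defaults) are fs.default.name in core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
and mapred.job.tracker in mapred-site.xml, next to the staging property from item 5:
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>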
Current state:
Able to run HDFS in pseudo-distributed mode.
Able to run MapReduce in pseudo-distributed mode.
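A quick sanity check after start-all.sh is the JDK's jps tool; in Hadoop 1.x pseudo-distributed mode it should list NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker:
jps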