windows 下的hadoop的安装

最新推荐文章于 2025-04-22 14:54:02 发布

llcode

最新推荐文章于 2025-04-22 14:54:02 发布

阅读量1.3k

点赞数 1

文章标签： windows namenode cygwin hadoop

本文链接：https://blog.csdn.net/llcode/article/details/12684103

版权

1、去网站下载最新的cygwin 版本http://cygwin.com/install.html。

2、下载jdk6及以上版本。

3、下载hadoop的稳定版本。去官网上看。

好的，下载的东西就这么多了。下来开始安装。

一、cygwin的安装。

按照图中说明选择下一步。

需要说明的是：在Root Directory 中的目录，最好不要有空格。这里的路径选择为 D:\SoftInstallProgramFiles\cygwin

在上图所示的对话框中，设置Cygwin安装包存放目录，然后点击“下一步”，进入如上图所示对话框：目录设置为：D:\SoftInstallProgramFiles\cygwin\download

选择下一步。

然后从上图的的链接中选择一个连接，开始下载。如果不行，就换下一个链接。

二、安装hadoop的过程。

需要注意的细节：

1、必须在cygwin中将对hadoop-1.2.1.tar.gz解压，模拟linux下文件解压（tar -zxf hadoop-1.2.1.tar.gz）。解压后文件可以放入到cygwin/opt/中，opt文件夹是新建的（这个依照自己爱好）。

2、接下来，需要修改hadoop的配置文件，它们位于conf子目录下，分别是hadoop-env.sh、core-site.xml、hdfs-site.xml和mapred-site.xml共四个文件。
� 修改hadoop-env.sh
a 、只需要将JAVA_HOME修改成JDK的安装目录即可，请注意JDK必须是1.6或以上版本。

b、设置JDK的安装目录时，路径不能是windows风格的目录(d:\java\jdk1.6.0_13)，而是LINUX风格 (/cygdrive/d/java/jdk1.6.0_13)。

在hadoop-env.sh中设定JDK的安装目录：export JAVA_HOME=/cygdrive/d/java/jdk1.6.0_13
� 修改core-site.xml
为简化core-site.xml配置，将d:cygwin\opt\hadoop-1.2.1\src\core目录下的core-default.xml文件复制到d:cygwin\opt\hadoop-1.2.1\conf目录下，并将core-default.xml文件名改成core-site.xml。修改fs.default.name的值。

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<property>
<name>hadoop.tmp.dir</name>
<value>/opt/tmp/hadoop-${user.name}</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>

� 修改hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>
� 修改mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
<property>
<name>mapred.child.tmp</name>
<value>/opt/temp</value>
<description> To set the value of tmp directory for map and reduce tasks.
If the value is an absolute path, it is directly assigned. Otherwise, it is
prepended with task's working directory. The java tasks are executed with
option -Djava.io.tmpdir='the absolute path of the tmp dir'. Pipes and
streaming are set with environment variable,
TMPDIR='the absolute path of the tmp dir'
</description>
</property>
</configuration>