Table of Contents
4 Prepare to Start the Hadoop Cluster
6 Pseudo-Distributed Operation
1 Purpose
This document describes how to set up and configure a single-node Hadoop installation so that you can quickly perform simple operations using Hadoop MapReduce and the Hadoop Distributed File System (HDFS).
Important: all production Hadoop clusters use Kerberos to authenticate callers and secure access to HDFS data, as well as restricting access to computation services (YARN etc.).
These instructions do not cover integration with any Kerberos services; everyone bringing up a production cluster should include connecting to their organisation’s Kerberos infrastructure as a key part of the deployment.
See Security for details on how to secure a cluster.
2 Prerequisites
Supported Platforms
- GNU/Linux is supported as a development and production platform. Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
Required Software
Required software for Linux includes:
- Java™ must be installed. Recommended Java versions are described at HadoopJavaVersions.
- ssh must be installed and sshd must be running in order to use the Hadoop scripts that manage remote Hadoop daemons, if the optional start and stop scripts are to be used. Additionally, it is recommended that pdsh also be installed for better ssh resource management.
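Before moving on, the presence of these tools can be checked from the shell. The sketch below assumes a POSIX shell; `check_tool` is a hypothetical helper and not part of Hadoop itself. Note that `sshd` typically lives in `/usr/sbin`, which may not be on a non-root user's `PATH`, so adjust the search accordingly.

```shell
# check_tool: hypothetical helper that reports whether a command
# is available on the current PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Report on each tool this guide requires or recommends.
# (sshd may need /usr/sbin added to PATH before this finds it.)
for tool in java ssh sshd pdsh; do
    check_tool "$tool"
done
```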
Installing Software
If your cluster doesn’t have the requisite software, you will need to install it.
For example on Ubuntu Linux:
$ sudo apt-get install ssh
$ sudo apt-get install pdsh
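A JDK can usually be installed from the package manager as well. The package name below is an assumption and varies by Ubuntu release; check HadoopJavaVersions for which Java versions your Hadoop release supports before choosing one.

```shell
# Assumed package name; newer Ubuntu releases may ship
# openjdk-11-jdk or later instead.
$ sudo apt-get install openjdk-8-jdk
```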
3 Download
To get a Hadoop distribution, download a