![](https://img-blog.csdnimg.cn/20210725181518957.png?x-oss-process=image/resize,m_fixed,h_224,w_224)
Apache Hadoop
The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.
lucklilili
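Hadoop: Port 50070 Is Unreachable
Listing the running Java processes with jps shows that the NameNode process is missing. Format the NameNode with hdfs namenode -format, then run the start script start-all.sh; jps now shows that the NameNode process exists. Open port 50070 in a browser. ...
Original · 2022-03-08 10:50:26 · 2165 views · 0 comments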
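Hadoop YARN Schedulers and Algorithms
Hadoop has three main job schedulers: FIFO (First In First Out), Capacity (Capacity Scheduler), and Fair (Fair Scheduler). The default resource scheduler in Apache Hadoop 3.1.3 is the Capacity Scheduler. FIFO scheduler (First In First Out Scheduler): a single queue that serves jobs in submission order, first come, first served. Advantage: simple and easy to understand. Disadvantage: no support for multiple queues, so it is rarely used in production. Hadoop: ...
Original · 2021-08-24 19:25:05 · 285 views · 0 comments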
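The first-come, first-served behaviour of the FIFO scheduler can be illustrated with a small sketch. This is a toy model, not YARN code: the Job class, its runtime field, and run_fifo are hypothetical names introduced only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    runtime: int  # hypothetical runtime in arbitrary time units


def run_fifo(jobs):
    """Run jobs from a single queue strictly in submission order.

    Returns a list of (job name, start time, finish time) tuples.
    """
    queue = deque(jobs)  # one queue only: no priorities, no per-team queues
    clock = 0
    timeline = []
    while queue:
        job = queue.popleft()  # first come, first served
        start = clock
        clock += job.runtime
        timeline.append((job.name, start, clock))
    return timeline


if __name__ == "__main__":
    submitted = [Job("etl", 5), Job("report", 1), Job("adhoc", 2)]
    for name, start, finish in run_fifo(submitted):
        print(f"{name}: start={start} finish={finish}")
```

In this model a short job submitted behind a long one has to wait for it to finish, which is one reason a single queue fits shared production clusters poorly.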
Hadoop YARN
The fundamental idea of YARN is to split up the functionalities of resource management and job scheduling/monitoring into separate daemons. The idea is to have a global ResourceManager (RM) and per-application ApplicationMaster (AM). An application is eith...
Original · 2021-08-23 13:45:35 · 157 views · 0 comments
Getting Started with Hadoop MapReduce
Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner. A MapRedu...
Translated · 2021-08-17 20:20:23 · 70 views · 0 comments
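As a rough illustration of the programming model, the sketch below runs a word count through map, shuffle, and reduce steps in a single Python process. It is not the Hadoop API and has none of the parallelism or fault tolerance described above; all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Chain the three steps over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop yarn hdfs", "hadoop mapreduce"]))
```

On a real cluster the map and reduce steps run as separate tasks on many nodes, and the framework handles the grouping between them.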
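Hadoop HDFS DataNode Mechanism
(1) A data block is stored on the DataNode's disk as two files: one holds the data itself, the other holds metadata, including the block length, the checksum of the block data, and a timestamp. (2) After starting, the DataNode registers with the NameNode; once accepted, it periodically (every 6 hours) reports all of its block information to the NameNode. The interval at which a DN sends this report to the NN defaults to 6 hours. <property> <name>dfs.blockreport.intervalMsec</name> ...
Original · 2021-08-17 18:58:07 · 102 views · 0 comments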
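The property name ends in Msec, so the 6-hour interval is expressed in milliseconds. Because the configuration snippet above is cut off, the following is only a sketch of the conversion and of the usual name/value layout of a Hadoop property; hours_to_msec and property_xml are hypothetical helpers, not part of Hadoop.

```python
def hours_to_msec(hours):
    """Convert hours to milliseconds."""
    return int(hours * 60 * 60 * 1000)


def property_xml(name, value):
    """Render one property in the usual Hadoop name/value layout."""
    return (
        "<property>\n"
        f"  <name>{name}</name>\n"
        f"  <value>{value}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    interval = hours_to_msec(6)  # the 6-hour default mentioned above
    print(property_xml("dfs.blockreport.intervalMsec", interval))
```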
Hadoop HDFS Replication Mechanism
Data Replication: HDFS is designed to reliably store very large files across machines in a large cluster. It stores each file as a sequence of blocks. The blocks of a file are replicated for fault tolerance. The block size and replication factor are con...
Original · 2021-08-16 20:59:03 · 3103 views · 0 comments
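To make the block and replica arithmetic concrete, here is a small sketch. The 128 MiB block size and replication factor of 3 are illustrative values chosen for the example, and block_count and raw_storage are hypothetical helper names.

```python
MIB = 1024 * 1024


def block_count(file_size, block_size):
    """Number of blocks needed for a file (ceiling division)."""
    return -(-file_size // block_size)


def raw_storage(file_size, replication):
    """Bytes consumed across the cluster once every block is replicated.

    The last block only occupies as much space as the data it holds,
    so the total is simply file size times replication factor.
    """
    return file_size * replication


if __name__ == "__main__":
    size = 300 * MIB  # illustrative file size
    print(block_count(size, 128 * MIB), "blocks")
    print(raw_storage(size, 3) // MIB, "MiB of raw storage")
```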
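Hadoop HDFS Network Topology: Node Distance Calculation
Node distance: the sum of the distances from the two nodes to their nearest common ancestor.
Original · 2021-08-16 20:53:27 · 207 views · 0 comments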
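The definition can be sketched as a short function, assuming each node is identified by a topology path such as /d1/r1/n0 (data center, rack, node). The path layout and the name node_distance are assumptions made for this example, with the root of the tree as the implicit top-level ancestor.

```python
def node_distance(path_a, path_b):
    """Sum of the hops from each node up to the nearest common ancestor."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    # Count how many leading path components the two nodes share.
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


if __name__ == "__main__":
    print(node_distance("/d1/r1/n0", "/d1/r1/n0"))  # same node
    print(node_distance("/d1/r1/n0", "/d1/r1/n1"))  # same rack
    print(node_distance("/d1/r1/n0", "/d1/r2/n0"))  # same data center
    print(node_distance("/d1/r1/n0", "/d2/r3/n0"))  # different data centers
```

Under these assumptions the four calls give 0, 2, 4, and 6 respectively.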
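Hadoop HDFS Read and Write Flow
(1) The client asks the NameNode, through the Distributed FileSystem module, to upload a file; the NameNode checks whether the target file already exists and whether the parent directory exists. (2) The NameNode replies whether the upload may proceed. (3) The client asks which DataNode servers the first Block should be uploaded to. (4) The NameNode returns three DataNode nodes: dn1, dn2, and dn3. (5) The client asks dn1, through the FSDataOutputStream module, to upload data; on receiving the request, dn1 calls dn2, and dn2 then calls ...
Original · 2021-08-16 20:30:20 · 293 views · 0 comments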
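The handshake above can be modelled with a toy simulation. Nothing here is Hadoop's real API: ToyNameNode, build_pipeline, and upload_first_block are hypothetical names, and because the summary is cut off after dn2, extending the chain to dn3 is an assumption.

```python
class ToyNameNode:
    """Hypothetical stand-in for the NameNode, for illustration only."""

    def __init__(self, existing_paths, datanodes):
        self.existing_paths = set(existing_paths)
        self.datanodes = list(datanodes)

    def can_upload(self, path):
        """The target must not exist yet and its parent directory must."""
        parent = path.rsplit("/", 1)[0] or "/"
        return path not in self.existing_paths and parent in self.existing_paths

    def choose_datanodes(self, count=3):
        """Return the DataNodes that should hold the block."""
        return self.datanodes[:count]


def build_pipeline(datanodes):
    """Return the hops of the chain, e.g. client -> dn1 -> dn2 -> dn3."""
    chain = ["client"] + list(datanodes)
    return list(zip(chain, chain[1:]))


def upload_first_block(namenode, path):
    """Ask permission, ask for DataNodes, then build the pipeline."""
    if not namenode.can_upload(path):
        raise ValueError(f"upload of {path} rejected by the NameNode")
    return build_pipeline(namenode.choose_datanodes())


if __name__ == "__main__":
    nn = ToyNameNode({"/", "/data"}, ["dn1", "dn2", "dn3", "dn4"])
    for sender, receiver in upload_first_block(nn, "/data/file.txt"):
        print(f"{sender} -> {receiver}")
```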
Hadoop 3.0.0 New features
Differences between Hadoop 3.x and Hadoop 2.x, and new features. Minimum required Java version increased from Java 7 to Java 8: all Hadoop JARs are now compiled targeting a runtime version of Java 8. Users still using Java 7 or below must upgrade to Java...
Original · 2021-07-25 12:09:26 · 1700 views · 5 comments