Learning Hadoop: a pitfall log

I have recently been working through big-data tutorials and set up a Hadoop HA cluster locally (hadoop: 2.10.0, hive: 1.2.2). I ran into the problems below and am recording them here.

MR jobs fail after the Hadoop cluster is switched to HA mode

Before HA was enabled, MR jobs ran normally. After enabling HA, running an HQL statement that triggers an MR job fails. The log shows the following error:

2020-03-22 00:12:05,577 ERROR [Listener at 0.0.0.0/37442] org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Webapps failed to start. Ignoring for now:
java.lang.NullPointerException
	at org.apache.hadoop.util.StringUtils.join(StringUtils.java:956)
	at org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer.initFilter(AmFilterInitializer.java:74)
	at org.apache.hadoop.http.HttpServer2.initializeWebServer(HttpServer2.java:463)
	at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:409)
	at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:112)
	at org.apache.hadoop.http.HttpServer2$Builder.build(HttpServer2.java:333)
	at org.apache.hadoop.yarn.webapp.WebApps$Builder.build(WebApps.java:315)
	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:401)
	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:397)
	at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.serviceStart(MRClientService.java:143)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1272)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1746)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1742)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1673)

Two frames in the log stood out: at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.getHttpPort(MRClientService.java:177)
and at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:156). After some searching I found the cause: with HA enabled, yarn-site.xml needs the following additional configuration:

	<property>
      <name>yarn.resourcemanager.webapp.address.rm1</name>
      <value>rm1-hostname:8088</value>
    </property>

    <property>
      <name>yarn.resourcemanager.webapp.address.rm2</name>
      <value>rm2-hostname:8088</value>
    </property>
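Forgetting one of the rm-ids is an easy mistake, so a quick sanity check is to parse yarn-site.xml and confirm that every id listed in yarn.resourcemanager.ha.rm-ids has a matching webapp address. This is just a sketch I put together; the file path and hostnames are placeholders, not from an actual cluster:

```python
import xml.etree.ElementTree as ET

def missing_webapp_addresses(yarn_site_path):
    """Return the rm-ids that lack a yarn.resourcemanager.webapp.address.<id> entry."""
    props = {}
    for prop in ET.parse(yarn_site_path).getroot().iter("property"):
        props[prop.findtext("name")] = prop.findtext("value")
    rm_ids = [i.strip() for i in
              props.get("yarn.resourcemanager.ha.rm-ids", "").split(",") if i.strip()]
    return [i for i in rm_ids
            if f"yarn.resourcemanager.webapp.address.{i}" not in props]

if __name__ == "__main__":
    import os, tempfile
    # Hypothetical minimal yarn-site.xml where rm2 is missing its webapp address
    sample = """<configuration>
      <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
      <property><name>yarn.resourcemanager.webapp.address.rm1</name><value>rm1-hostname:8088</value></property>
    </configuration>"""
    with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as f:
        f.write(sample)
    print(missing_webapp_addresses(f.name))  # -> ['rm2']
    os.unlink(f.name)
```

Running this against the real yarn-site.xml should print an empty list once both properties above are in place.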

HiveServer2 starts, but calling an MR job from Beeline fails

The exception is as follows:

java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hdfs is not allowed to impersonate hdfs
	at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:285)
	at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:328)
	at org.apache.hadoop.hive.ql.Context.getMRTmpPath(Context.java:389)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:225)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1676)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1435)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1218)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1077)
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
	at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
	at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
	at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

The exception says the hdfs account is not allowed to impersonate the hdfs user. According to posts online and the official docs, the cause is that no proxyuser rules for the account are configured in core-site.xml.
Official docs: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html
So I added the following settings (my Hadoop cluster runs as the hadoop user, hence the values below):

<!-- Replace "hadoop" in the property names below with the user your cluster runs as -->
<property>
      <name>hadoop.proxyuser.hadoop.groups</name>
      <value>*</value>
      <description>Allow the superuser hadoop to impersonate members of any group</description>
 </property>

 <property>
      <name>hadoop.proxyuser.hadoop.hosts</name>
      <value>*</value>
      <description>Allow the superuser hadoop to impersonate users when connecting from any host</description>
  </property>
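To see why these two keys matter: Beeline connects as an end user, but HiveServer2 talks to the NameNode as the superuser it runs as, and the NameNode only allows that impersonation if the hadoop.proxyuser.&lt;superuser&gt;.groups and .hosts rules permit it. The sketch below is my own simplified approximation of that check, not Hadoop's actual code (the real implementation also supports a .users key and resolves hostnames to IPs):

```python
def may_impersonate(superuser, client_host, end_user_groups, conf):
    """Simplified proxyuser authorization check, core-site.xml style.

    conf maps property names to values; '*' means 'any',
    otherwise values are comma-separated lists.
    """
    groups = conf.get(f"hadoop.proxyuser.{superuser}.groups", "")
    hosts = conf.get(f"hadoop.proxyuser.{superuser}.hosts", "")
    group_ok = groups == "*" or bool(
        set(end_user_groups) & {g.strip() for g in groups.split(",")})
    host_ok = hosts == "*" or client_host in [h.strip() for h in hosts.split(",")]
    return group_ok and host_ok

# Rules exist only for the "hadoop" superuser, mirroring the config above
conf = {
    "hadoop.proxyuser.hadoop.groups": "*",
    "hadoop.proxyuser.hadoop.hosts": "*",
}
print(may_impersonate("hadoop", "some-host", ["users"], conf))  # True
print(may_impersonate("hdfs", "some-host", ["users"], conf))    # False: no rules for hdfs
```

This also explains the later surprise below: rules for the hadoop user do nothing when the request arrives attributed to hdfs.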

Then refresh the proxyuser settings on the NameNode.
For a non-HA cluster:
hdfs dfsadmin -refreshSuperUserGroupsConfiguration

For an HA cluster, run it once against each NameNode:
hdfs dfsadmin -fs hdfs://namenode1-hostname -refreshSuperUserGroupsConfiguration
hdfs dfsadmin -fs hdfs://namenode2-hostname -refreshSuperUserGroupsConfiguration

Then refresh the YARN cluster as well:
yarn rmadmin -refreshSuperUserGroupsConfiguration

That felt like a solid fix, so I tried again right away. Still the same error. Frustrating. One detail stood out, though: the user I run with is hadoop, but the error message names hdfs, the highest-privilege account, so the users do not match. Starting from that, I set export HADOOP_USER_NAME=hadoop in /etc/profile; even after sourcing it the setting did not take effect. So I instead added a proxyuser rule for the hdfs account in core-site.xml, and the error was finally gone.

That exposed one last problem, an HDFS file-permission error, which is much simpler to deal with: grant 777 on the path the error complains about:
hadoop dfs -chmod -R 777 <path-without-permission>
After that, everything ran normally.
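For that last step, chmod 777 is a blunt fix that is fine for a local learning cluster (on a shared cluster, chown to the right user would be cleaner). HDFS uses the same owner/group/other rwx model as POSIX, and this small sketch shows how a mode is evaluated and why 777 lets any user write:

```python
def can_access(mode, want, is_owner, in_group):
    """Check an rwx permission against an octal mode, POSIX/HDFS-style.

    mode: e.g. 0o777; want: 'r', 'w', or 'x'.
    """
    bit = {"r": 4, "w": 2, "x": 1}[want]
    if is_owner:
        cls = (mode >> 6) & 7   # owner bits
    elif in_group:
        cls = (mode >> 3) & 7   # group bits
    else:
        cls = mode & 7          # other bits
    return bool(cls & bit)

print(can_access(0o755, "w", is_owner=False, in_group=False))  # False: others cannot write
print(can_access(0o777, "w", is_owner=False, in_group=False))  # True after chmod -R 777
```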
